[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-30 Thread Sameer Sahasrabuddhe via Phabricator via cfe-commits
sameerds added a comment.

In D69498#2728451 , @JonChesterfield 
wrote:

> I wonder if it is necessary for the exec mask to be implicit state, managed 
> by a convergent/divergent abstraction.
>
> We could reify it as an argument passed to kernels (where all bits are set), 
> and adjusted by phi nodes at entry to basic blocks. Various intrinsics take 
> and return that reified value. __ballot, a comparison op, possibly load/store.
>
> At that point all the awkward control flow constraints are raised to data 
> flow, and I think (but haven't really chased this into the dark corners) that 
> solves the problems I know about. Notably, a uniform branch would be one 
> where the exec mask value was unchanged, so the associated phi is constant 
> folded.
>
> Makes a mess of the human readable IR, but possibly at a lower overall 
> complexity than continuing to handle divergence as control flow.

That is essentially what the current divergence analysis (Karrenberg and Hack) 
does anyway. Except that since we don't know how many threads are running 
concurrently, the mask is a symbolic value that can either be "divergent" or 
"uniform". If I am looking at this the right way, you seem to be saying two 
separate things: the notion of divergence/uniformity is what solves the problem 
at hand, and separately, the notion of divergence can be turned into explicit 
dataflow.

> edit: said reified mask, if integer, would also be the value post-volta sync 
> intrinsics take, where the developer is supposed to compute the mask across 
> branches and pass it around everywhere. Clang could therefore provide 
> builtins that lower to those sync intrinsics with an always correct mask 
> automatically passed. That would make volta much easier to program.

Additionally, the explicit mask (if we could agree on its type) would make it 
possible to have "dynamic uniformity". Currently we can only prove static 
uniformity.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment.

I wonder if it is necessary for the exec mask to be implicit state, managed by 
a convergent/divergent abstraction.

We could reify it as an argument passed to kernels (where all bits are set), 
and adjusted by phi nodes at entry to basic blocks. Various intrinsics take and 
return that reified value. __ballot, a comparison op, possibly load/store.

At that point all the awkward control flow constraints are raised to data flow, 
and I think (but haven't really chased this into the dark corners) that solves 
the problems I know about. Notably, a uniform branch would be one where the 
exec mask value was unchanged, so the associated phi is constant folded.

Makes a mess of the human readable IR, but possibly at a lower overall 
complexity than continuing to handle divergence as control flow.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-24 Thread Sameer Sahasrabuddhe via Phabricator via cfe-commits
sameerds added a comment.

In D69498#2712706 , @nhaehnle wrote:

> In D69498#2711396 , @foad wrote:
>
>> I don't have much to add to the conversation except to point out that this 
>> definition defines `noconvergent` in terms of divergent control flow, but 
>> the langref itself doesn't define what divergent control flow //is//, which 
>> makes it an incomplete spec. (Perhaps I'm just restating @arsenm's 
>> objections.) This seems unsatisfactory to me but I have no idea what to do 
>> about it. I agree with @sameerds that the current definition of `convergent` 
>> is too restrictive because in practice we really do want to be able to move 
>> convergent calls past uniform control flow.
>
> That is one of the things that D85603  
> addresses. I suspect Sameer was assuming the language from there in the 
> background.
>
> I agree with Matt that it would be best to avoid too much TTI dependence. The 
> existing situation with the intrinsics is already a bit of a strange one. I 
> wonder if it is possible to move to a world where isSourceOfDivergence need 
> not be target-dependent. (This requires e.g. new attributes on function 
> arguments as well as new information about address spaces in the data layout, 
> plus some new attributes to define intrinsic data flow. Likely beyond the 
> scope of this patch.)

In general, yes, it is great to avoid dependence on TTI. But this particular 
instance is actually a dependence on DA. The fact that DA depends on TTI is a 
separate problem. But I think that's a natural fallout of the fact that LLVM is 
not (and should not be) in the business of defining an execution model for 
multiple threads. Every optimization that affects control flow needs to be able 
to reason about an execution model that LLVM itself does not define. So what we 
end up with is an approximate model in the form of information gleaned from the 
frontend as well as the target. Neither source is "more correct" or "less 
ideal" than the other.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-23 Thread Nicolai Hähnle via Phabricator via cfe-commits
nhaehnle added a comment.

In D69498#2711396 , @foad wrote:

> I don't have much to add to the conversation except to point out that this 
> definition defines `noconvergent` in terms of divergent control flow, but the 
> langref itself doesn't define what divergent control flow //is//, which makes 
> it an incomplete spec. (Perhaps I'm just restating @arsenm's objections.) 
> This seems unsatisfactory to me but I have no idea what to do about it. I 
> agree with @sameerds that the current definition of `convergent` is too 
> restrictive because in practice we really do want to be able to move 
> convergent calls past uniform control flow.

That is one of the things that D85603  
addresses. I suspect Sameer was assuming the language from there in the 
background.

I agree with Matt that it would be best to avoid too much TTI dependence. The 
existing situation with the intrinsics is already a bit of a strange one. I 
wonder if it is possible to move to a world where isSourceOfDivergence need to 
be target-dependent. (This requires e.g. new attributes on function arguments 
as well as new information about address spaces in the data layout, plus some 
new attributes to define intrinsic data flow. Likely beyond the scope of this 
patch.)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-23 Thread Jay Foad via Phabricator via cfe-commits
foad added a comment.

In D69498#2705441 , @sameerds wrote:

> I would propose refining the definition of the `noconvergent` attribute as 
> follows:
>
>> noconvergent:
>>
>> Some targets with a parallel execution model provide cross-thread operations 
>> whose outcome is affected by the presence of divergent control flow. We call 
>> such operations convergent. Optimizations that change control flow may 
>> affect the correctness of a program that uses convergent operations. In the 
>> presence of divergent control flow, such optimizations conservatively treat 
>> every call/invoke instruction as convergent by default. The noconvergent 
>> attribute relaxes this constraint as follows:
>>
>> - The noconvergent attribute can be added to a call/invoke to indicate that 
>> it is not affected by changes to the control flow that reaches it.
>> - The noconvergent attribute can be added to a function to indicate that it 
>> does not execute any convergent operations. A call/invoke automatically 
>> inherits the noconvergent attribute if it is set on the callee.

I don't have much to add to the conversation except to point out that this 
definition defines `noconvergent` in terms of divergent control flow, but the 
langref itself doesn't define what divergent control flow //is//, which makes 
it an incomplete spec. (Perhaps I'm just restating @arsenm's objections.) This 
seems unsatisfactory to me but I have no idea what to do about it. I agree with 
@sameerds that the current definition of `convergent` is too restrictive 
because in practice we really do want to be able to move convergent calls past 
uniform control flow.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-23 Thread Sameer Sahasrabuddhe via Phabricator via cfe-commits
sameerds added a comment.

In D69498#2710675 , @arsenm wrote:

> In D69498#2709218 , @sameerds wrote:
>
>> 1. It is actually okay add control dependences to a convergent function as 
>> long as the new branches are uniform.
>> 2. Some times we hack a transform to avoid doing things that cannot be 
>> described as "add new control dependence", and still talk about the function 
>> having the above named attribute.
>>
>> There is another important bug that obfuscates the whole discussion: most 
>> places that use "isConvergent()" should in fact first check if it really 
>> matters to the surrounding control flow. There is too much loaded onto that 
>> one attribute, without sufficient context. The definition of "noconvergent" 
>> that I am proposing starts out by first fixing the definition of convergent 
>> itself. This definition is independent of target, and only talks about the 
>> properties of the control flow that reach the callsite. To repeat, this does 
>> not reinterpret the IR in a target-defined way. Like I said, in the new 
>> definition, all function calls are convergent even on CPUs, so I don't see 
>> where the target comes in. If you still insist on talking about 
>> interpretation of core attributes, please start by deconstructing the 
>> definition that I propose so I can see what I am missing.
>
> Yes, the current concept of convergent is broken. I think this whole recent 
> burst of conversation has been too focused on the current convergent 
> situation and this patch in particular. I think we need to conceptually 
> rebase this into the world that D85603  
> creates.

Actually that is exactly what this current conversation is doing. My definition 
of `noconvergent` first rebases the definition of `convergent` into what D85603 
 does. My wording is no different than the one 
you see here: https://reviews.llvm.org/D85603#change-WRNH9XUSSoR8

> The expected convergence behavior is not only a target property, but of the 
> source language. The frontend ultimately has to express the intended 
> semantics of these cross lane regions, and the convergence tokens gives this 
> ability.

That is true, and it is covered by the simple fact that we both agree that 
`convergent` needs to be the default unless specified by the frontend using 
`noconvergent`.

> My point about not making this target dependent is we shouldn't be trying to 
> shoehorn in TTI checks around convergent operations just to save frontends 
> from the inconvenience of adding another attribute to function declarations. 
> We can interprocedurally infer noconvergent like any other attribute most of 
> the time. The convergent calls and their tokens should be interpreted the 
> same even if the target CPU doesn't really have cross lane behavior.

This is not an attempt to shoehorn TTI checks for mere convenience. You missed 
the part about how the current use of `CallInst::isConvergent()` is broken. 
This is where the wording in D85603  inherits 
a major fault from the existing definition of convergent. It is wrong to say 
that optimizations are plain forbidden by the mere knowledge/assumption that 
some call is convergent. Those optimizations are forbidden only if divergent 
control flow is involved. The definition of `convergent` needs to refrain from 
being prescriptive about what the compiler can and cannot do. See the 
definition of `nosync` for comparison.

The convergent property merely says that the call is special, and then it is up 
to each optimization to decide what happens. That decision is based on whether 
the optimization affects divergent control flow incident on the call, and 
whether that effect is relevant to a convergent function. Now that's where the 
following reasoning comes into play:

1. Every use of isConvergent() needs to be tempered with knowledge of the 
control flow incident on it.
2. The only way to check for divergent control flow is to call on divergence 
analysis.
3. When divergence analysis is called on a target with no divergence, it 
returns `nullptr`, which is treated as equivalent to returning `uniform` for 
every query. This is not hypothetical, it's already happening whenever a 
transform does `if (!DA) { ... }`.
4. Pending an attempt to actually call DA at each of these places, a correct 
simplification is to assume uniform control flow when the target does not have 
divergence, and assume divergent control flow otherwise.

That's all there is to this. Like I keep saying, this is not a 
reinterpretation, nor is it a hack. From my first comment on the thread, it is 
an attempt to recast the whole question o be one about combining optimizations 
with target-specific information. Now before you balk at yet another use of the 
phrase "target-specific", please note that DA is totally target-specific. The 
"sources 

[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D69498#2709218 , @sameerds wrote:

> 1. It is actually okay add control dependences to a convergent function as 
> long as the new branches are uniform.
> 2. Some times we hack a transform to avoid doing things that cannot be 
> described as "add new control dependence", and still talk about the function 
> having the above named attribute.
>
> There is another important bug that obfuscates the whole discussion: most 
> places that use "isConvergent()" should in fact first check if it really 
> matters to the surrounding control flow. There is too much loaded onto that 
> one attribute, without sufficient context. The definition of "noconvergent" 
> that I am proposing starts out by first fixing the definition of convergent 
> itself. This definition is independent of target, and only talks about the 
> properties of the control flow that reach the callsite. To repeat, this does 
> not reinterpret the IR in a target-defined way. Like I said, in the new 
> definition, all function calls are convergent even on CPUs, so I don't see 
> where the target comes in. If you still insist on talking about 
> interpretation of core attributes, please start by deconstructing the 
> definition that I propose so I can see what I am missing.

Yes, the current concept of convergent is broken. I think this whole recent 
burst of conversation has been too focused on the current convergent situation 
and this patch in particular. I think we need to conceptually rebase this into 
the world that D85603  creates. The expected 
convergence behavior is not only a target property, but of the source language. 
The frontend ultimately has to express the intended semantics of these cross 
lane regions, and the convergence tokens gives this ability.

My point about not making this target dependent is we shouldn't be trying to 
shoehorn in TTI checks around convergent operations just to save frontends from 
the inconvenience of adding another attribute to function declarations. We can 
interprocedurally infer noconvergent like any other attribute most of the time. 
The convergent calls and their tokens should be interpreted the same even if 
the target CPU doesn't really have cross lane behavior.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-22 Thread Mehdi AMINI via Phabricator via cfe-commits
mehdi_amini added a comment.

What matter isn't "convergent" in itself, it is how it restricts 
transformations: if you know that all the control flow is always uniform, are 
there still restriction on any transformation in presence of convergent 
instructions?

If not, then the "target approach" seems nice: as @foad and @sameerds noted we 
just need to have a guarantee on these targets about the uniformity of the 
branches, it does not require to change the convergent semantics on a target 
basis I think.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-22 Thread Sameer Sahasrabuddhe via Phabricator via cfe-commits
sameerds added a comment.

In D69498#2708990 , @arsenm wrote:

> You're still saying the same thing. This needs to be defined generically. 
> Frontends don't *have* to to anything, they'll just get the assumed 
> convergent behavior by default. Either frontends could always add 
> noconvergent to avoid any possible optimization hit, or we could have a pass 
> add noconvergent depending on the target. I don't want to change the 
> interpretation of core attributes based on the target

Like I said earlier, this not a reinterpretation base on target. I think we are 
not talking about the same "convergent" here. There is currently an attribute 
in LLVM with the spelling "c-o-n-v-e-r-g-e-n-t". It does not do what its name 
says. It has a prescriptive definition that says "thou shalt not add control 
dependences to this function". This is not what we actually need, because it 
fails in two ways:

1. It is actually okay add control dependences to a convergent function as long 
as the new branches are uniform.
2. Some times we hack a transform to avoid doing things that cannot be 
described as "add new control dependence", and still talk about the function 
having the above named attribute.

There is another important bug that obfuscates the whole discussion: most 
places that use "isConvergent()" should in fact first check if it really 
matters to the surrounding control flow. There is too much loaded onto that one 
attribute, without sufficient context. The definition of "noconvergent" that I 
am proposing starts out by first fixing the definition of convergent itself. 
This definition is independent of target, and only talks about the properties 
of the control flow that reach the callsite. To repeat, this does not 
reinterpret the IR in a target-defined way. Like I said, in the new definition, 
all function calls are convergent even on CPUs, so I don't see where the target 
comes in. If you still insist on talking about interpretation of core 
attributes, please start by deconstructing the definition that I propose so I 
can see what I am missing.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D69498#2707100 , @sameerds wrote:

>>> 1. The meaning of the `convergent` attribute has always been 
>>> target-dependent.
>>> 2. Dependence on TTI is not a real cost at all. We may eventually update 
>>> every use of isConvergent() to depend on a check for divergence. The check 
>>> for TTI is only the first step towards that.
>>
>> The core IR semantics must *not* depend on target interpretation. convergent 
>> has never been target dependent, and was defined in an abstract way. Only 
>> certain targets will really care, but we cannot directly define it to mean 
>> what we want it to mean for particular targets
>
> My bad. I should have said "the implementation of convergent" is target 
> dependent. But even that is not precise enough. It is more correct to say 
> that the convergent attribute can be implemented inside LLVM by depending on 
> TTI so that it only impacts targets that have divergence. This dependence is 
> not new; it's merely missing because the current uses of convergent 
> pessimistically assume divergent control flow on targets.

You're still saying the same thing. This needs to be defined generically. 
Frontends don't *have* to to anything, they'll just get the assumed convergent 
behavior by default. Either frontends could always add noconvergent to avoid 
any possible optimization hit, or we could have a pass add noconvergent 
depending on the target. I don't want to change the interpretation of core 
attributes based on the target


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-22 Thread Sameer Sahasrabuddhe via Phabricator via cfe-commits
sameerds added a comment.

In D69498#2706524 , @arsenm wrote:

> In D69498#2705441 , @sameerds wrote:
>
>> I realize now that what @foad says above puts the idea in a clearer context. 
>> Essentially, any check for isConvergent() isn't happening in a vacuum. It is 
>> relevant only in the presence of divergent control flow, which in turn is 
>> relevant only when the target has divergence. Any standalone check for 
>> isConvergent() is merely making a pessimistic assumption that all the 
>> control flow incident on it is divergent. This has two consequences:
>>
>> 1. The meaning of the `convergent` attribute has always been 
>> target-dependent.
>> 2. Dependence on TTI is not a real cost at all. We may eventually update 
>> every use of isConvergent() to depend on a check for divergence. The check 
>> for TTI is only the first step towards that.
>
> The core IR semantics must *not* depend on target interpretation. convergent 
> has never been target dependent, and was defined in an abstract way. Only 
> certain targets will really care, but we cannot directly define it to mean 
> what we want it to mean for particular targets

My bad. I should have said "the implementation of convergent" is target 
dependent. But even that is not precise enough. It is more correct to say that 
the convergent attribute can be implemented inside LLVM by depending on TTI so 
that it only impacts targets that have divergence. This dependence is not new; 
it's merely missing because the current uses of convergent pessimistically 
assume divergent control flow on targets.

The important point is that frontends have to do nothing special for targets 
that don't have divergence, because the concepts of convergence and divergence 
have a trivial effect on optimizations. I had already observed in a previous 
comment that this new definition does not amount to reinterpreting the same 
program differently on different targets. It is perfectly okay to say that all 
calls are convergent on a CPU, because all branches are trivially uniform on 
the CPU (this part is inspired by @foad's observation), and hence the default 
convergent nature of calls has no effect on optimization for CPUs. If we were 
to rename `isConvergent()` to `hasConvergenceConstraints()` then we can see 
that it trivially returns `false` for any target that has no divergence.

Note that my proposed definition for `noconvergent` makes no mention of the 
target. The first sentence starts with "some targets", but that's just for a 
helpful context. The concept of convergent calls is defined purely in terms of 
divergent control flow. This one sentence very nicely ties everything up:

> In the presence of divergent control flow, such optimizations conservatively 
> treat every call/invoke instruction as convergent by default.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D69498#2705441 , @sameerds wrote:

> I realize now that what @foad says above puts the idea in a clearer context. 
> Essentially, any check for isConvergent() isn't happening in a vacuum. It is 
> relevant only in the presence of divergent control flow, which in turn is 
> relevant only when the target has divergence. Any standalone check for 
> isConvergent() is merely making a pessimistic assumption that all the control 
> flow incident on it is divergent. This has two consequences:
>
> 1. The meaning of the `convergent` attribute has always been target-dependent.
> 2. Dependence on TTI is not a real cost at all. We may eventually update 
> every use of isConvergent() to depend on a check for divergence. The check 
> for TTI is only the first step towards that.

The core IR semantics must *not* depend on target interpretation. convergent 
has never been target dependent, and was defined in an abstract way. Only 
certain targets will really care, but we cannot directly define it to mean what 
we want it to mean for particular targets


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-21 Thread Sameer Sahasrabuddhe via Phabricator via cfe-commits
sameerds added a comment.

I realize now that what @foad says above puts the idea in a clearer context. 
Essentially, any check for isConvergent() isn't happening in a vacuum. It is 
relevant only in the presence of divergent control flow, which in turn is 
relevant only when the target has divergence. Any standalone check for 
isConvergent() is merely making a pessimistic assumption that all the control 
flow incident on it is divergent. This has two consequences:

1. The meaning of the `convergent` attribute has always been target-dependent.
2. Dependence on TTI is not a real cost at all. We may eventually update every 
use of isConvergent() to depend on a check for divergence. The check for TTI is 
only the first step towards that.

I would propose refining the definition of the `noconvergent` attribute as 
follows:

> noconvergent:
>
> Some targets with a parallel execution model provide cross-thread operations 
> whose outcome is affected by the presence of divergent control flow. We call 
> such operations convergent. Optimizations that change control flow may affect 
> the correctness of a program that uses convergent operations. In the presence 
> of divergent control flow, such optimizations conservatively treat every 
> call/invoke instruction as convergent by default. The noconvergent attribute 
> relaxes this constraint as follows:
>
> - The noconvergent attribute can be added to a call/invoke to indicate that 
> it is not affected by changes to the control flow that reaches it.
> - The noconvergent attribute can be added to a function to indicate that it 
> does not execute any convergent operations. A call/invoke automatically 
> inherits the noconvergent attribute if it is set on the callee.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-21 Thread Jay Foad via Phabricator via cfe-commits
foad added a comment.

> But in practice, the main issue for everyone is the effect on compile time 
> for targets that don't care about convergence/divergence. For such targets, 
> running even the divergence analysis is an unnecessary cost.

LegacyDivergenceAnalysis::runOnFunction bails out immediately if 
!hasBranchDivergence.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-21 Thread Sameer Sahasrabuddhe via Phabricator via cfe-commits
sameerds added a comment.

In D69498#2704364 , @foad wrote:

>> This recasts the whole question to be one about combining optimizations with 
>> target-specific information. The only changes required are in transforms 
>> that check `CallInst::isConvergent()`. These should now also check `TTI`, 
>> possibly adding a dependency on the `TTI` analysis where it didn't exist 
>> earlier.
>
> @sameerds I agree with your conclusions but I would describe the situation a 
> little differently. As I understand it, the optimizations that check 
> isConvergent really only care about moving convergent calls past control flow 
> //that might be divergent//. !hasBranchDivergence is a promise that there are 
> no possible sources of divergence for a target, so you can run a divergence 
> analysis if you like but it will just tell you that everything is uniform, so 
> all control flow is uniform, so it's OK to move isConvergent calls around.
>
> In practice the optimizations that check isConvergent don't seem to use 
> divergence analysis, they just pessimistically assume that any control flow 
> might be divergent (if hasBranchDivergence). But they could and perhaps 
> should use divergence analysis, and then it would all just fall out in the 
> wash with no need for an explicit hasBranchDivergence test.

Sure it is formally the same thing. But in practice, the main issue for 
everyone is the effect on compile time for targets that don't care about 
convergence/divergence. For such targets, running even the divergence analysis 
is an unnecessary cost. Any checks for uniform control flow will still be 
hidden under TTI.hasBranchDivergence() for such targets. We see that happening 
already in places where divergence analysis is constructed optionally.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-21 Thread Sameer Sahasrabuddhe via Phabricator via cfe-commits
sameerds added a comment.

The way I see it, the notion of convergence is relevant only to a certain class 
of targets (usually represented by GPUs) and it only affects certain 
optimizations. Then why not have only these optimizations check `TTI` to see if 
convergence matters? `TTI.hasBranchDivergence()` seems like a sufficient proxy 
for this information.

1. `convergent` becomes the default in LLVM IR, but it does not affect 
optimizations on non-GPU targets.
2. This is not a reinterpretation of the same IR on different targets. The 
notional execution model of LLVM IR will say that all function calls are 
convergent. Targets that only care about one thread at a time represent the 
degenerate case where all executions are convergent anyway.

This recasts the whole question to be one about combining optimizations with 
target-specific information. The only changes required are in transforms that 
check `CallInst::isConvergent()`. These should now also check `TTI`, possibly 
adding a dependency on the `TTI` analysis where it didn't exist earlier.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-21 Thread Jay Foad via Phabricator via cfe-commits
foad added a comment.

In D69498#2703317 , @sameerds wrote:

> The way I see it, the notion of convergence is relevant only to a certain 
> class of targets (usually represented by GPUs) and it only affects certain 
> optimizations. Then why not have only these optimizations check `TTI` to see 
> if convergence matters? `TTI.hasBranchDivergence()` seems like a sufficient 
> proxy for this information.
>
> 1. `convergent` becomes the default in LLVM IR, but it does not affect 
> optimizations on non-GPU targets.
> 2. This is not a reinterpretation of the same IR on different targets. The 
> notional execution model of LLVM IR will say that all function calls are 
> convergent. Targets that only care about one thread at a time represent the 
> degenerate case where all executions are convergent anyway.
>
> This recasts the whole question to be one about combining optimizations with 
> target-specific information. The only changes required are in transforms that 
> check `CallInst::isConvergent()`. These should now also check `TTI`, possibly 
> adding a dependency on the `TTI` analysis where it didn't exist earlier.

@sameerds I agree with your conclusions but I would describe the situation a 
little differently. As I understand it, the optimizations that check 
isConvergent really only care about moving convergent calls past control flow 
//that might be divergent//. !hasBranchDivergence is a promise that there are 
no possible sources of divergence for a target, so you can run a divergence 
analysis if you like but it will just tell you that everything is uniform, so 
all control flow is uniform, so it's OK to move isConvergent calls around.

In practice the optimizations that check isConvergent don't seem to use 
divergence analysis, they just pessimistically assume that any control flow 
might be divergent (if hasBranchDivergence). But they could and perhaps should 
use divergence analysis, and then it would all just fall out in the wash with 
no need for an explicit hasBranchDivergence test.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a subscriber: sameerds.
JonChesterfield added a comment.
Herald added subscribers: Naghasan, lxfind, okura, bbn, kuter, kerbowa, 
pengfei, jrtc27.
Herald added a reviewer: sstefan1.
Herald added a reviewer: baziotis.

I've managed to run a bug in some freestanding c++ compiled for amdgpu down to 
forgetting the convergent attribute in strategic places. OpenMP does now set 
the current attribute correctly, I think. I wonder if anyone has audited all 
the rocm device code to verify it is correctly annotated.

The trailing comments above are 'that works for me' and 'not strongly opposed'. 
@sameerds is this something you're looking at picking up? Patch has probably 
rotted a bit.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-11-27 Thread Mehdi AMINI via Phabricator via cfe-commits
mehdi_amini added a comment.

Talked with Matt at the last social, they thought I was strongly opposed to 
this: I am not, and I told them I would clarify here:

I don't buy the argument of "frontend projects have consistently run into 
problems related to this", I think this is not a good justification for this 
change. But this change seems like it can be motivated without this argument.
I also have been trying to help brainstorming alternatives just to see if we 
could get something that would fit what Matt is looking to achieve while 
getting past @rjmccall concerns, not because I am against this change.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-11-06 Thread Mehdi AMINI via Phabricator via cfe-commits
mehdi_amini added a comment.

In D69498#1735763 , @arsenm wrote:

> In D69498#1731265 , @mehdi_amini 
> wrote:
>
> > Just thought about a slight variation on this: what about adding a flag on 
> > the datalayout (or a module flag) but not use it in the 
> > transformations/analyses, instead use it only when creating Function by 
> > always setting the `convergent` attribute when the flag is present? This 
> > would require to always have a Module passed to the Function constructor 
> > though (the C API already does, we would just need to update the C++ users).
> >
> > So creating a Function in a Module with this flag would have the convergent 
> > attribute set on creation (can be unset explicitly).
>
>
> This sounds almost too low level. The bitcode loading of a function shouldn’t 
> be adding attributes to non-intrinsics, otherwise it’s changing the meaning 
> of the program and not preserving the input


You're right, but I thought loading a function from bitcode would be 1) create 
the function (it may get a default convergent attribute added) and 2) set the 
exact attributes (which would override the default)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-11-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D69498#1731265 , @mehdi_amini wrote:

> In D69498#1727650 , @dexonsmith 
> wrote:
>
> > In D69498#1723606 , @rjmccall 
> > wrote:
> >
> > > Perhaps there should be a global metadata, or something in the 
> > > increasingly-misnamed "data layout" string, which says that convergence 
> > > is meaningful, and we should only add the attribute in 
> > > appropriately-annotated modules?
> >
> >
> > Just wanted to resurface these alternatives from John.  Given that some 
> > targets want a fundamentally different default from what most frontends 
> > expect, I think we ought to find some way to encode the difference.
>
>
> Just thought about a slight variation on this: what about adding a flag on 
> the datalayout (or a module flag) but not use it in the 
> transformations/analyses, instead use it only when creating Function by 
> always setting the `convergent` attribute when the flag is present? This 
> would require to always have a Module passed to the Function constructor 
> though (the C API already does, we would just need to update the C++ users).
>
> So creating a Function in a Module with this flag would have the convergent 
> attribute set on creation (can be unset explicitly).


This sounds almost too low level. The bitcode loading of a function shouldn’t 
be adding attributes to non-intrinsics, otherwise it’s changing the meaning of 
the program and not preserving the input


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-11-03 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

In D69498#1731265 , @mehdi_amini wrote:

> In D69498#1727650 , @dexonsmith 
> wrote:
>
> > In D69498#1723606 , @rjmccall 
> > wrote:
> >
> > > Perhaps there should be a global metadata, or something in the 
> > > increasingly-misnamed "data layout" string, which says that convergence 
> > > is meaningful, and we should only add the attribute in 
> > > appropriately-annotated modules?
> >
> >
> > Just wanted to resurface these alternatives from John.  Given that some 
> > targets want a fundamentally different default from what most frontends 
> > expect, I think we ought to find some way to encode the difference.
>
>
> Just thought about a slight variation on this: what about adding a flag on 
> the datalayout (or a module flag) but not use it in the 
> transformations/analyses, instead use it only when creating Function by 
> always setting the `convergent` attribute when the flag is present? This 
> would require to always have a Module passed to the Function constructor 
> though (the C API already does, we would just need to update the C++ users).
>
> So creating a Function in a Module with this flag would have the convergent 
> attribute set on creation (can be unset explicitly).


That works for me.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-11-02 Thread Mehdi AMINI via Phabricator via cfe-commits
mehdi_amini added a comment.

In D69498#1727650 , @dexonsmith wrote:

> In D69498#1723606 , @rjmccall wrote:
>
> > Perhaps there should be a global metadata, or something in the 
> > increasingly-misnamed "data layout" string, which says that convergence is 
> > meaningful, and we should only add the attribute in appropriately-annotated 
> > modules?
>
>
> Just wanted to resurface these alternatives from John.  Given that some 
> targets want a fundamentally different default from what most frontends 
> expect, I think we ought to find some way to encode the difference.


Just thought about a slight variation on this: what about adding a flag on the 
datalayout (or a module flag) but not use it in the transformations/analyses, 
instead use it only when creating Function by always setting the `convergent` 
attribute when the flag is present? This would require to always have a Module 
passed to the Function constructor though (the C API already does, we would 
just need to update the C++ users).

So creating a Function in a Module with this flag would have the convergent 
attribute set on creation (can be unset explicitly).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-11-01 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia added a comment.

In D69498#1725692 , @kariddi wrote:

> In D69498#1725528 , @rjmccall wrote:
>
> > It absolutely makes sense for Clang as a GPU-programming frontend to set 
> > attributes appropriately when targeting the GPU.  I'm objecting to making 
> > "convergence" and related "code layout is semantics" properties a universal 
> > part of the IR semantics that literally every frontend has to know about in 
> > order to get reasonable behavior from LLVM.  I know that doing so makes 
> > sense to GPU backend developers because you mostly work exclusively on GPU 
> > toolchains, but AFAIK there are half a dozen GPU frontends and they're all 
> > forks of Clang, whereas there are dozens of LLVM frontends out there that 
> > don't care about targeting the GPU and quite possibly never will.  (And 
> > even if they do target GPUs, they often will not care about exposing thread 
> > groups; many programming environments are just interested in taking 
> > advantage of the GPU's parallel-computation power and have no interest in 
> > inter-thread interference.)
> >
> > John.
>
>
> I agree that the issue with making it "transparent" as a concept to whoever 
> is not interested in programming models that have the concept of SIMD-wide 
> execution is an open issue (not only related to this patch, but in general to 
> the convergent idea as a whole, where writing a llvm pass that maintains 
> convergence now is an extra burden to the developer of such pass that wasn't 
> there before and that is probably completely orthogonal to the interest of 
> the pass writer probably targeting C/C++ or other non-parallel languages). I 
> opened some discussions going on the other related RFC for extending the 
> concept of convergent to avoid hoisting as well regarding how are we gonna 
> avoid burdening the LLVM community and maintain the invariants we want with 
> respect to this concept.
>  I have no idea what the impact of the convergent attribute is in Clang (with 
> the exception of adding the flag itself), but a lot of the heavy-lifting I 
> know its in LVLM itself.
>
> That said I just want to point out that some of these languages run on CPUs 
> as well (like openCL ) and they share the same properties with respect to 
> instructions that read execution masks of neighboring threads and making sure 
> threads stay converged when executing them. It's certainly unfortunate that 
> LLVM has some deficiencies in supporting these concepts and that the so far 
> proposed solutions are not burden free for the rest of the community. :-/


Aside from functionality issues with inter-thread interference, there is 
another very related aspect - optimizing for SIMT execution. This topic hasn't 
been discussed much yet but optimizing for single threaded execution doesn't 
always yield optimal result when run in SIMT style. For example divergent code 
regions can significantly slow down execution of multiple threads even if they 
have small instruction count. I think the biggest gap in LLVM is absence of a 
way to represent SIMT concept in general i.e. by adding special annotations to 
functions that run in SIMT mode or adding annotation to the whole module. This 
can be useful not only to correctly implement `convergent` but to potentially 
setup separate optimization pass pipeline for GPU-like targets to allow 
excluding harmful transformations and potentially add more passes to middle end 
that are targeting SIMT.  Currently there is no solution to this at middle end 
and therefore each backend ends up adding the same logic during lowering phases 
instead. However, adding the concept of SIMT seems like a large architectural 
change and might not be something we can do quickly. Potentially we can 
identify some small steps towards this goal that can be done now to at least to 
support the `convergent` for SIMT targets correctly without affecting much 
non-SIMT community.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

As far as optimization inhibition is concerned, noconvergent will be inferred 
for all functions that don't call convergent intrinsics (i.e. the state of the 
world for all functions on all CPU targets). The frontend needing to do 
something for optimization comes up in relation to calling external 
declarations which won't be inferred in non-LTO contexts (plus inline asm and 
indirect calls, the other opaque call types). IMO the mild inconvenience of 
needing to add another attribute to optimize external call declarations is not 
unreasonable. This is already the situation for nounwind with languages that 
don't have exceptions. I don't see what real benefit there is to inventing some 
new inconsistent, target-specific IR construct to avoid frontends needing to 
add an attribute that behaves nearly identically to nounwind


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D69498#1728039 , @tra wrote:

> Perhaps we can deal with that by providing a way to specify per-module 
> default for the assumed convergence of the functions and then checking in the 
> back-end (only those that do care about convergence) that the default 
> convergence is explicitly set (and, perhaps, set to something specific?) via 
> function/module attributes or CLI.
>
> This way the unintentional use of vanilla IR w/o attributes with NVPTX 
> back-end will produce an error complaining that default convergence is not 
> set and we don't know if the IR is still sound. If necessary, the user can 
> set appropriate convergence wholesale via CLI or module attribute. The burden 
> on platforms that don't care about convergence will be limited to setting the 
> default and applying attributes on entities that do not match the default 
> assumption (there may be none of those).


Convergent can be correctly stripped in many cases (and noconvergent inferred 
in the same cases), so erroring if a function isn't convergent won't really work


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

Would I be too far off the mark to summarize the situation this way?

- current default for unattributed functions is unsound for GPU back-ends, but 
is fine for most of the other back-ends.
- it's easy to unintentionally end up using/mis-optimizing unattributed 
functions for back-ends where that care about convergence.
- making functions convergent by default everywhere is correct, but is overly 
pessimistic for back-ends that do not need it. Such back-ends will need to add 
explicit attributes in order to maintain current level of optimizations. In a 
way it's the other side of the current situation -- default is not good for 
some back-ends and we have to add attributes everywhere to make things work. 
Flipping the default improves correctness-by-default, but places logistical 
burden on back- and front-ends that may not care about convergence otherwise.

Perhaps we can deal with that by providing a way to specify per-module default 
for the assumed convergence of the functions and then checking in the back-end 
(only those that do care about convergence) that the default convergence is 
explicitly set (and, perhaps, set to something specific?) via function/module 
attributes or CLI.

This way the unintentional use of vanilla IR w/o attributes with NVPTX back-end 
will produce an error complaining that default convergence is not set and we 
don't know if the IR is still sound. If necessary, the user can set appropriate 
convergence wholesale via CLI or module attribute. The burden on platforms that 
don't care about convergence will be limited to setting the default and 
applying attributes on entities that do not match the default assumption (there 
may be none of those).

One pitfall of this approach is that we may run optimizations based on wrong 
assumptions and end up miscompiling things before we know what those 
assumptions will be (e.g. opt vs llc). Perhaps we can document the default 
assumption to be nonconvergent and always set a module attribute indicating the 
assumption used. This assumption will percolate through the optimization 
pipeline. If its the wrong one we would be able to reject / warn about it in 
the passes for the back-ends that have different preference.

Would something like that be a reasonable compromise?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.

In D69498#1727626 , @mehdi_amini wrote:

> In D69498#1727546 , @jdoerfert wrote:
>
> > In D69498#1727419 , @mehdi_amini 
> > wrote:
> >
> > > In D69498#1727080 , @jdoerfert 
> > > wrote:
> > >
> > > > Let me quote @arsenm here because this is so important: "Basically no 
> > > > frontend has gotten this right, including clang and non-clang 
> > > > frontends." For me, that statement alone is reason to change the status 
> > > > quo.
> > >
> > >
> > > I am not convinced by this: this seems overly simplistic. The fact that 
> > > convergent wasn't rolled out properly in frontend can also been as an 
> > > artifact of the past.
> >
> >
> > Evidence suggest this problem will resurface over and over again, calling 
> > it an artifact of the past is dangerous and not helpful.
>
>
> Can you provide the "evidence"? So far the only thing I have seen is 
> historical and closely tied to how convergence was rolled out AFAICT.


Recently, as mentioned a few times before, OpenMP target offloading gets/got it 
wrong long after convergence was rolled out.
r238264 introduced convergent in 2015, r315094 fixed it in part for OpenCL in 
2017 (with the idea to remove noduplicate which was also broken!) a year after 
OpenCL support was started.

>>> There is a logical gap between this and concluding that the only solution 
>>> is to change the default.
>> 
>> There is no gap.
> 
> Bluntly contradicting what I wrote without apparently trying to understand my 
> point or trying to explain your point with respect to what I'm trying to 
> explain is not gonna get me to agree with you magically, I'm not sure what 
> you expect here other than shutting down the discussion and leaving us in a 
> frustrating disagreement and lack of understanding with this?

You removed my explanation that directly follows my statement. Please do not 
use my quotes out of context like this to blame me.

To reiterate what actually followed my above quote 
: If we have a "by default 
unsound behavior" we will by design have cases we miss, e.g., anything new not 
added with this special case in mind. So changing the default to the "by 
construction sound" case, is in my opinion the only way to prevent that. That 
being said, I already agreed that this might not be necessary for every target, 
e.g., based on the data layout. However, I do try to bring forward reasons why 
a "split in semantics" on this level will hurt us as a community and 
maintainers of the IR continuously.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

As I've said before, I absolutely view this polarity flip as good for GPU 
compilers, and I don't mind Clang needing to take some patches to update how we 
operate when targeting the GPU and to update some GPU-specific tests.  I do not 
think we should view GPU targets as "marginal".  But I do think that GPUs have 
different base semantics from CPUs (under normal, non-vector-transformed rules) 
and that it's not appropriate to unconditionally make those weaker GPU 
semantics the base semantics of LLVM IR.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D69498#1727626 , @mehdi_amini wrote:

> In D69498#1727546 , @jdoerfert wrote:
>
> > In D69498#1727419 , @mehdi_amini 
> > wrote:
> >
> > > In D69498#1727080 , @jdoerfert 
> > > wrote:
> > >
> > > > Let me quote @arsenm here because this is so important: "Basically no 
> > > > frontend has gotten this right, including clang and non-clang 
> > > > frontends." For me, that statement alone is reason to change the status 
> > > > quo.
> > >
> > >
> > > I am not convinced by this: this seems overly simplistic. The fact that 
> > > convergent wasn't rolled out properly in frontend can also been as an 
> > > artifact of the past.
> >
> >
> > Evidence suggest this problem will resurface over and over again, calling 
> > it an artifact of the past is dangerous and not helpful.
>
>
> Can you provide the "evidence"? So far the only thing I have seen is 
> historical and closely tied to how convergence was rolled out AFAICT.


I'm not sure what other evidence could exist for how something is handled in 
practice besides historical record. The current design requires a higher 
awareness at all times to avoid subtlety breaking code, which I think is 
fundamentally more error prone.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith added a comment.

In D69498#1723606 , @rjmccall wrote:

> Perhaps there should be a global metadata, or something in the 
> increasingly-misnamed "data layout" string, which says that convergence is 
> meaningful, and we should only add the attribute in appropriately-annotated 
> modules?


Just wanted to resurface these alternatives from John.  Given that some targets 
want a fundamentally different default from what most frontends expect, I think 
we ought to find some way to encode the difference.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Mehdi AMINI via Phabricator via cfe-commits
mehdi_amini added a comment.

In D69498#1727546 , @jdoerfert wrote:

> In D69498#1727419 , @mehdi_amini 
> wrote:
>
> > In D69498#1727080 , @jdoerfert 
> > wrote:
> >
> > > Let me quote @arsenm here because this is so important: "Basically no 
> > > frontend has gotten this right, including clang and non-clang frontends." 
> > > For me, that statement alone is reason to change the status quo.
> >
> >
> > I am not convinced by this: this seems overly simplistic. The fact that 
> > convergent wasn't rolled out properly in frontend can also been as an 
> > artifact of the past.
>
>
> Evidence suggest this problem will resurface over and over again, calling it 
> an artifact of the past is dangerous and not helpful.


Can you provide the "evidence"? So far the only thing I have seen is historical 
and closely tied to how convergence was rolled out AFAICT.

>> There is a logical gap between this and concluding that the only solution is 
>> to change the default.
> 
> There is no gap.

Bluntly contradicting what I wrote without apparently trying to understand my 
point or trying to explain your point with respect to what I'm trying to 
explain is not gonna get me to agree with you magically, I'm not sure what you 
expect here other than shutting down the discussion and leaving us in a 
frustrating disagreement and lack of understanding with this?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a subscriber: hfinkel.
jdoerfert added a comment.

In D69498#1727419 , @mehdi_amini wrote:

> In D69498#1727080 , @jdoerfert wrote:
>
> > Let me quote @arsenm here because this is so important: "Basically no 
> > frontend has gotten this right, including clang and non-clang frontends." 
> > For me, that statement alone is reason to change the status quo.
>
>
> I am not convinced by this: this seems overly simplistic. The fact that 
> convergent wasn't rolled out properly in frontend can also been as an 
> artifact of the past.


Evidence suggest this problem will resurface over and over again, calling it an 
artifact of the past is dangerous and not helpful.

> There is a logical gap between this and concluding that the only solution is 
> to change the default.

There is no gap. Making the default restrictive but sound will prevent various 
kinds of errors we have seen in the past. 
Assuming we could somehow "not repeat the errors in the future" is in my 
opinion the unrealistic view.

> For instance someone mentioned a pass that could be inserted as the very 
> beginning of the pipeline by any GPU compiler and add the convergent 
> attribute everywhere.

I talked to @hfinkel earlier about this and his idea was to use a pass (or 
IRBuilder mode, or sth.) to do the opposite: opt-into a "subset" of LLVM-IR 
behaviors.
The idea is that if you want just a low-level C for CPUs, you run this pass or 
enable this mode and you get the appropriate LLVM-IR. That would for this patch 
mean
you get `noconvergent` everywhere but in the future you could get other 
attributes/changes as well.

>>> As of the frontend, it seems to me that this is just about structuring the 
>>> code-path for function emission to define the right set of default 
>>> attribute. It isn't clear to me why a refactoring of the frontend isn't a 
>>> better course of actions here.
>> 
>> Whatever we do, there will be consequences for current and future 
>> front-ends. At the end of the day, the underlying question we need to answer 
>> here is: Do we want to have "optimization with opt-in soundness" or 
>> "soundness with opt-in optimizations", and I would choose the latter any 
>> time.
> 
> This view seems appropriate to me only if you consider a GPU (or SIMT) 
> compiler alone. Another view (which seems to be what @rjmccall has) is that 
> SIMT/GPU is a marginal use-case and "soundness" is already existing for most 
> LLVM use-cases.

I'm strongly opposing the idea to marginalize GPU, or any accelerator, targets. 
This is not only splitting the community by making it hostile towards some who 
have to work around whatever is deemed "the main use-case", it is also arguably 
a position of the past for many of us. Convenience for people that do not care 
about accelerators is a goal we should have in mind for sure, but it should not 
oppose a reasonable and sound support of accelerators.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D69498#1727419 , @mehdi_amini wrote:

> In D69498#1727080 , @jdoerfert wrote:
>
> > Let me quote @arsenm here because this is so important: "Basically no 
> > frontend has gotten this right, including clang and non-clang frontends." 
> > For me, that statement alone is reason to change the status quo.
>
>
> I am not convinced by this: this seems overly simplistic. The fact that 
> convergent wasn't rolled out properly in frontend can also been as an 
> artifact of the past. There is a logical gap between this and concluding that 
> the only solution is to change the default. For instance someone mentioned a 
> pass that could be inserted as the very beginning of the pipeline by any GPU 
> compiler and add the convergent attribute everywhere.


Avoiding human error is fundamental to good design. If you have to have an 
additional IR pass, then there's already a phase where the IR is broken and 
something could legally break the IR.

> 
> 
>>> As of the frontend, it seems to me that this is just about structuring the 
>>> code-path for function emission to define the right set of default 
>>> attribute. It isn't clear to me why a refactoring of the frontend isn't a 
>>> better course of actions here.
>> 
>> Whatever we do, there will be consequences for current and future 
>> front-ends. At the end of the day, the underlying question we need to answer 
>> here is: Do we want to have "optimization with opt-in soundness" or 
>> "soundness with opt-in optimizations", and I would choose the latter any 
>> time.
> 
> This view seems appropriate to me only if you consider a GPU (or SIMT) 
> compiler alone. Another view (which seems to be what @rjmccall has) is that 
> SIMT/GPU is a marginal use-case and "soundness" is already existing for most 
> LLVM use-cases.

I think this is a concerning argument. Declaring that GPUs are a "marginal" use 
case and LLVM only follows its design principles in cases where it benefits C++ 
x86/ARM users is an issue. In that case LLVM is no longer trying to be the 
compiler infrastructure for everyone that it tries to be. Soundness for most 
doesn't sound like a good design. I've proposed attributes in the past that 
were shot down for not following a correct by-default design and to me it seems 
like a problem if this principle isn't going to be universally followed. It's 
additionally concerning since most GPUs uses LLVM in some fashion if they're 
going to be declared a second class use case.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Mehdi AMINI via Phabricator via cfe-commits
mehdi_amini added a comment.

In D69498#1727080 , @jdoerfert wrote:

> Let me quote @arsenm here because this is so important: "Basically no 
> frontend has gotten this right, including clang and non-clang frontends." For 
> me, that statement alone is reason to change the status quo.


I am not convinced by this: this seems overly simplistic. The fact that 
convergent wasn't rolled out properly in frontend can also been as an artifact 
of the past. There is a logical gap between this and concluding that the only 
solution is to change the default. For instance someone mentioned a pass that 
could be inserted as the very beginning of the pipeline by any GPU compiler and 
add the convergent attribute everywhere.

>> As of the frontend, it seems to me that this is just about structuring the 
>> code-path for function emission to define the right set of default 
>> attribute. It isn't clear to me why a refactoring of the frontend isn't a 
>> better course of actions here.
> 
> Whatever we do, there will be consequences for current and future front-ends. 
> At the end of the day, the underlying question we need to answer here is: Do 
> we want to have "optimization with opt-in soundness" or "soundness with 
> opt-in optimizations", and I would choose the latter any time.

This view seems appropriate to me only if you consider a GPU (or SIMT) compiler 
alone. Another view (which seems to be what @rjmccall has) is that SIMT/GPU is 
a marginal use-case and "soundness" is already existing for most LLVM use-cases.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.

In D69498#1726610 , @mehdi_amini wrote:

> > The short version is the fact that most developers aren't aware of and 
> > don't understand the subtleties of convergence, and making sure the 
> > front-end does something in all contexts requires a lot of diligence. It's 
> > very easy to introduce these difficult to debug bugs when calls are broken 
> > by default.
>
> The fact that most developers aren't aware of convergence means a lot of 
> potential bugs in LLVM because passes and transformations won't honor the 
> convergent semantics, but the default for the attribute won't change that.


While you are right, a default non-convergent behavior will not make all bug 
sources disappear, it will make *a lot of bug sources disappear*. Let me quote 
@arsenm here because this is so important: "Basically no frontend has gotten 
this right, including clang and non-clang frontends." For me, that statement 
alone is reason to change the status quo.

> As of the frontend, it seems to me that this is just about structuring the 
> code-path for function emission to define the right set of default attribute. 
> It isn't clear to me why a refactoring of the frontend isn't a better course 
> of actions here.

Whatever we do, there will be consequences for current and future front-ends. 
At the end of the day, the underlying question we need to answer here is: Do we 
want to have "optimization with opt-in soundness" or "soundness with opt-in 
optimizations", and I would choose the latter any time.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment.

Requiring the presence of an attribute for correctness is a bad thing. OpenMP 
was missing this annotation in a number of places and is probably still missing 
it elsewhere. I wouldn't bet on handwritten bitcode libraries getting it right 
everywhere either. An optimisation pass accidentally dropping the attribute 
seems a plausible failure mode as well.

Strongly in favour of replacing convergent with no{n}convergent in the IR.

Not as convinced it should be inserted by the front end. The attribute is 
needed before any CFG rewrites and as far as I know they all occur downstream 
of the front end. That suggests an IR pass that walks over all functions, 
intrinsic calls, inline asm and so forth and marks them as appropriate. 
Standard C++/x86 and similar don't need to run the pass, OpenCL/x86 probably 
does. I'd suggest running it manually across handwritten bitcode as a sanity 
check as well.

Of course, //if// a front end targeting gpus wants to change control flow 
(julia may do this, mlir does, I sincerely hope clang doesn't) //and// use this 
attribute to control the process, then that front end picks up the 
responsibility for inserting the attribute where it wants it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-30 Thread Mehdi AMINI via Phabricator via cfe-commits
mehdi_amini added a comment.

In D69498#1725867 , @jdoerfert wrote:

> In D69498#1725819 , @mehdi_amini 
> wrote:
>
> > Maybe we can start by looking into the motivation for this patch:
> >
> > > There is a burden on frontends in environments that care about convergent 
> > > operations to add the attribute just in case it is needed. This has 
> > > proven to be somewhat problematic, and different frontend projects have 
> > > consistently run into problems related to this.
> >
> > Can you clarify what kind of problems? Do you have concrete examples?
>
>
> The frontend has to put convergent on everything that is not known to be 
> non-convergent right now. That is, the front-end and all function generating 
> constructs need to know about convergent to preserve correctness.


Right, but that does not seem like a problem to me so far: a GPU frontend must 
add a gpu-specific attribute on every function seems like a reasonable 
constraint to me.

> Frontends in the past put convergent only on barriers or similar calls but 
> not on all functions that might transitively call barriers, leading to subtle 
> bugs.

Right, but it seems like trying to fix bugs in the frontend at the LLVM does 
not seems necessarily right: it looks like a big hammer.

In D69498#1725877 , @arsenm wrote:

> Basically no frontend has gotten this right, including clang and non-clang 
> frontends. There's a gradual discovery process that it's necessary in the 
> first place, and then a long tail of contexts where it's discovered it's 
> missing. As I mentioned before, SYCL and OpenMP are both still broken in this 
> regard. For example in clang, convergent wasn't initially added to all 
> functions, and then it was discovered that transforms on indirect callers 
> violated it.


The rule seems fairly straightforward for the frontend: the attributes must be 
always added everywhere if you're using SIMT.
Originally it likely wasn't thought through for convergent, but at this point 
this seems behind us.

> Then later it was discovered this was broken for inline asm.

Can you clarify what you mean here and how this change would fix it?

> Then some specific intrinsic call sites were a problem.

Same, I'm not sure I understand this one.

> Then there are special cases that don't necessarily go through the standard 
> user function codegen paths which don't get the default attributes.



> Some odd places where IR is compiled into the user library that didn't have 
> convergent added.

Bug in the user library? Again I'm not sure why it justifies a change in LLVM.

> The short version is the fact that most developers aren't aware of and don't 
> understand the subtleties of convergence, and making sure the front-end does 
> something in all contexts requires a lot of diligence. It's very easy to 
> introduce these difficult to debug bugs when calls are broken by default.

The fact that most developers aren't aware of convergence means a lot of 
potential bugs in LLVM because passes and transformations won't honor the 
convergent semantics, but the default for the attribute won't change that.
As of the frontend, it seems to me that this is just about structuring the 
code-path for function emission to define the right set of default attribute. 
It isn't clear to me why a refactoring of the frontend isn't a better course of 
actions here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D69498#1725819 , @mehdi_amini wrote:

> Maybe we can start by looking into the motivation for this patch:
>
> > There is a burden on frontends in environments that care about convergent 
> > operations to add the attribute just in case it is needed. This has proven 
> > to be somewhat problematic, and different frontend projects have 
> > consistently run into problems related to this.
>
> Can you clarify what kind of problems? Do you have concrete examples?


Basically no frontend has gotten this right, including clang and non-clang 
frontends. There's a gradual discovery process that it's necessary in the first 
place, and then a long tail of contexts where it's discovered it's missing. As 
I mentioned before, SYCL and OpenMP are both still broken in this regard. For 
example in clang, convergent wasn't initially added to all functions, and then 
it was discovered that transforms on indirect callers violated it. Then later 
it was discovered this was broken for inline asm. Then some specific intrinsic 
call sites were a problem. Then there are special cases that don't necessarily 
go through the standard user function codegen paths which don't get the default 
attributes. Some odd places where IR is compiled into the user library that 
didn't have convergent added. The discovery of convergent violations is 
painful; more painful than discovering that control flow didn't optimize the 
way you expected. It requires eternal vigilance and all developers to be aware 
of it.

The short version is the fact that most developers aren't aware of and don't 
understand the subtleties of convergence, and making sure the front-end does 
something in all contexts requires a lot of diligence. It's very easy to 
introduce these difficult to introduce bugs when calls are broken by default.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-29 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.

In D69498#1725819 , @mehdi_amini wrote:

> Maybe we can start by looking into the motivation for this patch:
>
> > There is a burden on frontends in environments that care about convergent 
> > operations to add the attribute just in case it is needed. This has proven 
> > to be somewhat problematic, and different frontend projects have 
> > consistently run into problems related to this.
>
> Can you clarify what kind of problems? Do you have concrete examples?


The frontend has to put convergent on everything that is not known to be 
non-convergent right now. That is, the front-end and all function generating 
constructs need to know about convergent to preserve correctness. Dropping 
convergent is also not sound. Frontends in the past put convergent only on 
barriers or similar calls but not on all functions that might transitively call 
barriers, leading to subtle bugs.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-29 Thread Mehdi AMINI via Phabricator via cfe-commits
mehdi_amini added a comment.

Maybe we can start by looking into the motivation for this patch:

> There is a burden on frontends in environments that care about convergent 
> operations to add the attribute just in case it is needed. This has proven to 
> be somewhat problematic, and different frontend projects have consistently 
> run into problems related to this.

Can you clarify what kind of problems? Do you have concrete examples?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-29 Thread Marcello Maggioni via Phabricator via cfe-commits
kariddi added a comment.

In D69498#1725528 , @rjmccall wrote:

> It absolutely makes sense for Clang as a GPU-programming frontend to set 
> attributes appropriately when targeting the GPU.  I'm objecting to making 
> "convergence" and related "code layout is semantics" properties a universal 
> part of the IR semantics that literally every frontend has to know about in 
> order to get reasonable behavior from LLVM.  I know that doing so makes sense 
> to GPU backend developers because you mostly work exclusively on GPU 
> toolchains, but AFAIK there are half a dozen GPU frontends and they're all 
> forks of Clang, whereas there are dozens of LLVM frontends out there that 
> don't care about targeting the GPU and quite possibly never will.  (And even 
> if they do target GPUs, they often will not care about exposing thread 
> groups; many programming environments are just interested in taking advantage 
> of the GPU's parallel-computation power and have no interest in inter-thread 
> interference.)
>
> John.


I agree that the issue with making it "transparent" as a concept to whoever is 
not interested in programming models that have the concept of SIMD-wide 
execution is an open issue (not only related to this patch, but in general to 
the convergent idea as a whole, where writing a llvm pass that maintains 
convergence now is an extra burden to the developer of such pass that wasn't 
there before and that is probably completely orthogonal to the interest of the 
pass writer probably targeting C/C++ or other non-parallel languages). I opened 
some discussions going on the other related RFC for extending the concept of 
convergent to avoid hoisting as well regarding how are we gonna avoid burdening 
the LLVM community and maintain the invariants we want with respect to this 
concept.
I have no idea what the impact of the convergent attribute is in Clang (with 
the exception of adding the flag itself), but a lot of the heavy-lifting I know 
its in LVLM itself.

That said I just want to point out that some of these languages run on CPUs as 
well (like openCL ) and they share the same properties with respect to 
instructions that read execution masks of neighboring threads and making sure 
threads stay converged when executing them. It's certainly unfortunate that 
LLVM has some deficiencies in supporting these concepts and that the so far 
proposed solutions are not burden free for the rest of the community. :-/


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-29 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

In D69498#1724625 , @kariddi wrote:

> One thing to probably note is that its not only a "target specific" issue, 
> but a language specific issue as well (IMHO). OpenCL, CUDA, SYCL are all 
> languages (to name a few) that have a concept of "convergence" and its not 
> related only to the fact that they mostly run on a GPU, but to their 
> programming model as well with respect to accessing textures and SIMD-wide 
> communication and decision making (ballot)
>
> Considering that Clang has support for these languages it kind of makes sense 
> to me that the frontend is taking care of implementing the supported 
> languages correctly by applying the proper attributes.


It absolutely makes sense for Clang as a GPU-programming frontend to set 
attributes appropriately when targeting the GPU.  I'm objecting to making 
"convergence" and related "code layout is semantics" properties a universal 
part of the IR semantics that literally every frontend has to know about in 
order to get reasonable behavior from LLVM.  I know that doing so makes sense 
to GPU backend developers because you mostly work exclusively on GPU 
toolchains, but AFAIK there are half a dozen GPU frontends and they're all 
forks of Clang, whereas there are dozens of LLVM frontends out there that don't 
care about targeting the GPU and quite possibly never will.  (And even if they 
do target GPUs, they often will not care about exposing thread groups; many 
programming environments are just interested in taking advantage of the GPU's 
parallel-computation power and have no interest in inter-thread interference.)

John.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-29 Thread Nicolai Hähnle via Phabricator via cfe-commits
nhaehnle added a comment.

As you know, I am very much in favor of this change, and really anything that 
re-establishes the rule that code is treated maximally conservatively when 
there are no attributes or metadata.

There is one slight concern here because what we arguably need in SIMT targets 
is two attributes, which I'm going to call `allowconvergence` and 
`allowdivergence` for now. Convergent operations are operations that 
communicate with some set of other threads that is implicitly affected by where 
the operation is in control flow. If we have as a baseline that this set of 
threads must not be changed at all by transforms, then there are two directions 
in which this can be relaxed:

- It may be correct to transform code in a way that may cause **more** threads 
to communicate. This applies e.g. to OpenCL uniform barriers and could be 
expressed as `allowconvergence`.
- It may be correct to transform code in a way that may cause **fewer** threads 
to communicate. This applies to readanylane-style operations (which are 
implicit e.g. in some of our buffer and image intrinsics) and could be 
expressed as `allowdivergence`.

Though non-SIMT targets may not want to add two attributes everywhere. Perhaps 
a viable path forward is to take this change towards `noconvergent` now if we 
can still add `allowconvergence` and `allowdivergence` later; `noconvergent` 
would then simply be the union of those two attributes in the end.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-28 Thread Marcello Maggioni via Phabricator via cfe-commits
kariddi added a comment.

One thing to probably note is that its not only a "target specific" issue, but 
a language specific issue as well (IMHO). OpenCL, CUDA, SYCL are all languages 
(to name a few) that have a concept of "convergence" and its not related only 
to the fact that they mostly run on a GPU, but to their programming model as 
well with respect to accessing textures and SIMD-wide communication and 
decision making (ballot)

Considering that Clang has support for these languages it kind of makes sense 
to me that the frontend is taking care of implementing the supported languages 
correctly by applying the proper attributes.

Now, its debatable that "convergent" is enough to cover all the cases we need 
... and CUDA for example is running away from "ballot()" and such and moving to 
versions that have an explicit mask of the threads involved in the transaction 
for that reason. Even the new version with the "dynamic instances" of the 
mentioned proposal might not be enough ... but that's discussion for the other 
RFC ...


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

I think the question should be what the IR policy is for properties that are 
required for correctness and not necessarily what most users will use. There's 
a general trend towards functions being correct by default, and attributes 
adding optimization possibilities. convergent (and the broken noduplicate) are 
exceptional by inverting this. I think strictfp is also wrong in this sense, 
and should possibly also be fixed.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-28 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.

@rjmccall Thanks for the quick response! I tried to describe my view below. To 
avoid confusion on my side, could you maybe describe what you think when/which 
attribute should be created, derived, and used during the optimization pipeline?

In D69498#1724232 , @rjmccall wrote:

> The kind of tracking that you're doing of convergent operations is one 
> example of a basically limitless set of "taint" analyses, of which half a 
> dozen already exist in LLVM.


Yes, there are a lot, and to make things worse I added another one to rule them 
all. So at least in the middle-term there should be only one "(non)convergent" 
analysis.

> I object to the idea that an arbitrary number of unrelated tests need to be 
> changed every time somebody invents a new taint analysis.

Maybe the problem is that I don't understand why the effect on the test is so 
bad, given that it is the same as any new default attribute has (= very minimal 
and mechanical).
Or maybe I have less of an issue with these changes because I expect other 
attributes to be added as default soon and they will have a similar effect on 
the tests.
(To mention one with a made up name: `forward_process_guarantee` for functions 
that carry this property instead of pretending all do throughout the middle-end 
and hoping the front-ends deal with it accordingly.)

> I also object to the idea that frontends have a responsibility to proactively 
> compensate for every new weird target-specific representation change just to 
> get the optimizer to perform standard transformations that fully respect the 
> documented semantics of LLVM IR.

I generally agree and this has to be announced properly, but I don't think will 
require "much change" (there are default attribute sets already I suppose). 
It is also, arguably, in the interest of many/all frontends to be able to 
produce correct (GPU) parallel code by minimizing the chance of accidental 
errors.
Finally, we basically have that situation you object to already, every time we 
fix an "error" in the IR semantics which allowed a transformation in the 
general case (forward process comes to mind again, dereferenceable_globally as 
well D61652 , ...).

In D69498#1724245 , @rjmccall wrote:

> If you can come up with a way to make this change that doesn't require 
> changes to non-GPU frontends or tests, I agree that treating functions as 
> convergent by default makes sense for GPU targets.


Do you object `noconvergent` as an alternative to `convergent`?
If not, the one "problem" that pops up with having it not as a default 
attribute is that the middle-end would then need to check the target in 
addition to the `noconvergent` attribute, right?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-28 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

If you can come up with a way to make this change that doesn't require changes 
to non-GPU frontends or tests, I agree that treating functions as convergent by 
default makes sense for GPU targets.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-28 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

The kind of tracking that you're doing of convergent operations is one example 
of a basically limitless set of "taint" analyses, of which half a dozen already 
exist in LLVM.  I object to the idea that an arbitrary number of unrelated 
tests need to be changed every time somebody invents a new taint analysis.  I 
also object to the idea that frontends have a responsibility to proactively 
compensate for every new weird target-specific representation change just to 
get the optimizer to perform standard transformations that fully respect the 
documented semantics of LLVM IR.  The basic problem at the root of convergence 
analysis — that code layout is essentially part of GPU program semantics 
because of its impact on GPU thread groups — is in fact a very special property 
of GPU targets that is not in any way unreasonable to ask those targets to be 
explicit about.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-28 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a reviewer: JonChesterfield.
jdoerfert added a comment.

Before we get lost in review details let's make sure we agree that his is the 
right path. I mean, transition from `convergent` to `noconvergent` (or a 
similar spelling).
I do support this by the way.

In D69498#1723606 , @rjmccall wrote:

> [...] Perhaps there should be a global metadata, or something in the 
> increasingly-misnamed "data layout" string, which says that convergence is 
> meaningful, and we should only add the attribute in appropriately-annotated 
> modules?


I see what you mean but I think it would be "a hack" to teach the lookup 
methods, e.g., `isNoConvergent`, to also look into the datalayout (or any other 
place). Instead, front-ends should emit `noconvergent` as one of their default 
arguments if that is the behavior they want. This should be essentially free 
after they update their code. Neither idea will fix the "old IR" problem anyway 
but we could make the deduction add `noconvergent` to all functions that are 
not `convergent` if `convergent` was found in the module.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm marked an inline comment as done.
arsenm added a comment.

In D69498#1723606 , @rjmccall wrote:

> A note on spelling: the no prefix seems to be used largely with verbs; it's 
> weird to use it here with an adjective, especially since noncovergent is just 
> a letter away.


I don't think it reads any different to me than, say nofree->no free calls. 
noconvergent-> no convergent operations. nonconvergent sounds odder to me

> In D69498#1723553 , @rjmccall wrote:
> 
>> This certainly seems like a more tractable representation, although I 
>> suppose it'd be a thorn in the side of (IR-level) outlining.
> 
> 
> Sorry, I mis-clicked and sent this review while I was still editing it.
> 
> I agree that this is a theoretically better representation for targets that 
> care about convergence, since it biases compilation towards the more 
> conservative assumption.  On the other hand, since the default language mode 
> is apparently to assume non-convergence of user functions, and since the vast 
> majority of targets do not include convergence as a meaningful concept, 
> you're also just gratuitously breaking a ton of existing test cases, as well 
> as adding compile time for manipulating this attribute.  Perhaps there should 
> be a global metadata, or something in the increasingly-misnamed "data layout" 
> string, which says that convergence is meaningful, and we should only add the 
> attribute in appropriately-annotated modules?

The default language mode for the non-parallel languages. The set of languages 
not setting this does need to grow (SYCL and OpenMP device code are both 
currently broken).

I don't think there are other attributes with weird connections to some global 
construct. Part of the point of attributes is that they are function level 
context. I don't really understand the compile time concern; the cost of the 
attribute is one bit in a bitset. We have quite a few attributes already that 
will be inferred for a large percentage of functions anyway. The case that's 
really important to add in frontend is external declarations and other opaque 
cases, others will be inferred anyway. You could make the same argument with 
nounwind, since most languages don't have exceptions.




Comment at: clang/lib/CodeGen/CGObjCMac.cpp:4259
   call->setDoesNotThrow();
+  call->setNoConvergent();
   call->setCallingConv(CGF.getRuntimeCC());

rjmccall wrote:
> I don't think GPU thread convergence is a concept that could ever really be 
> applied to a program containing ObjC exceptions, so while this seems correct, 
> it also seems pointless.
It doesn't apply, so this needs to be explicitly marked as not having it. This 
bypasses the place that emits the language assumed default attribute so this is 
needed. A handful of tests do fail without this from control flow changes not 
happening


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-28 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

In D69498#1723553 , @rjmccall wrote:

> This certainly seems like a more tractable representation, although I suppose 
> it'd be a thorn in the side of (IR-level) outlining.


Sorry, I mis-clicked and sent this review while I was still editing it.

I agree that this is a theoretically better representation for targets that 
care about convergence, since it biases compilation towards the more 
conservative assumption.  On the other hand, since the default language mode is 
apparently to assume non-convergence of user functions, and since the vast 
majority of targets do not include convergence as a meaningful concept, you're 
also just gratuitously breaking a ton of existing test cases, as well as adding 
compile time for manipulating this attribute.  Perhaps there should be a global 
metadata, or something in the increasingly-misnamed "data layout" string, which 
says that convergence is meaningful, and we should only add the attribute in 
appropriately-annotated modules?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69498: IR: Invert convergent attribute handling

2019-10-28 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

This certainly seems like a more tractable representation, although I suppose 
it'd be a thorn in the side of (IR-level) outlining.

A note on spelling: the `no` prefix seems to be used largely with verbs; it's 
weird to use it here with an adjective, especially since `noncovergent` is just 
a letter away.




Comment at: clang/lib/CodeGen/CGObjCMac.cpp:4259
   call->setDoesNotThrow();
+  call->setNoConvergent();
   call->setCallingConv(CGF.getRuntimeCC());

I don't think GPU thread convergence is a concept that could ever really be 
applied to a program containing ObjC exceptions, so while this seems correct, 
it also seems pointless.



Comment at: clang/lib/CodeGen/CGStmt.cpp:1949
 Result.addAttribute(llvm::AttributeList::FunctionIndex,
-llvm::Attribute::Convergent);
+llvm::Attribute::NoConvergent);
   // Extract all of the register value results from the asm.

The comment here seems to no longer match the code.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69498/new/

https://reviews.llvm.org/D69498



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits