[PATCH] D86559: [Sema, CodeGen] Allow [[likely]] and [[unlikely]] on labels

Aaron Ballman via Phabricator via cfe-commits Thu, 27 Aug 2020 11:16:31 -0700

aaron.ballman added a comment.

In D86559#2242586 <https://reviews.llvm.org/D86559#2242586>, @Mordante wrote:


>> That is exactly the behavior I am coming to believe we should follow. You 
>> can either write it on a compound statement that is guarded by a flow 
>> control decision (`if`/`else`/`while`) or you can write it on a label, 
>> otherwise the attribute is parsed and ignored (with a diagnostic). This 
>> feels like the right mixture of useful with understandable semantics so far, 
>> but perhaps we'll find other examples that change our minds.
>>
>> The fallthrough behavior question has one more case we may want to think 
>> about:
>>
>>   switch (x) {
>>   case 0:
>>   case 1:
>>   [[likely]] case 2: break;
>>   [[unlikely]] default:
>>   }
>>
>> Does this mark cases `0` and `1` as being likely or not? I could see users 
>> wanting to use shorthand rather than repeat themselves on all the cases. 
>> However, I'm also not certain whether there would be any performance impact 
>> if we marked only `case 2` as likely and left cases `0` and `1` with default 
>> likelihood. My gut feeling is that this should only mark `case 2`, but 
>> others may have different views.
>
> Not according to the standard: "A path of execution includes a label if and 
> only if it contains a jump to that label."

A switch statement contains a jump to all of its labels, though. So all of 
those labels are included in the path of execution, which is not likely what's 
intended by the standard.

> However for an if statement I use 2 weights: 2000 for likely, 1 for unlikely. 
> (These can be configured by a command-line option already used for 
> `__builtin_expect`.)
> For a switch I intent to use 3 weights: 2000 for likely, (2000 + 1)/2 for no 
> likelihood, 1 for unlikely. `__builtin_expect` doesn't have a 'neutral' 
> values since in a switch you can only mark one branch as likely. Instead of 
> adding a new option I picked the midpoint.
> So the weights in your example would be:
>
>   > switch (x) {
>   case 0:  // 1000
>   case 1:  // 1000
>   [[likely]] case 2: break; // 2000
>   [[unlikely]] default: // 1
>    }

I think this is defensible.

> In this case the user could also write:
>
>   > switch (x) {
>   case 0:  // 1000
>   case 1:  // 1000
>   case 2: break; // 1000
>   [[unlikely]] default: // 1
>    }
>
> where the values 0, 1, 2 still would be more likely than the default 
> statement. Obviously this will be documented when I implement the switch.
> How do you feel about this approach?

I think it's a reasonable approach; at least, I can't think of a case where 
this falls over.

>>> I fully agree the behaviour mandated by the standard is way too complex and 
>>> user unfriendly. It would have been nice if there were simpler rules, 
>>> making it easier to use and to teach. Still I think it would be best to use 
>>> the complex approach now, since that's what the standard specifies. During 
>>> that process we can see whether there are more pitfalls. Then we can 
>>> discuss it with other vendors and see whether we can change the wording of 
>>> the standard. Do you agree?
>>
>> The only requirement from the standard is that we parse `[[likely]]` or 
>> `[[unlikely]]` on a statement or label, that we don't allow conflicting 
>> attributes in the same attribute sequence, and that the attribute has no 
>> arguments to it. All the rest is recommended practice that's left up to the 
>> implementation as a matter of QoI. So we conform to the standard by 
>> following the approach on compound statements and labels + the constraint 
>> checking. We may not be following the recommended practice precisely, but we 
>> may not *want* to follow the recommended practice because it's bad for 
>> usability. So I'd like us to design the feature that we think gives the most 
>> value and is the easiest to use that matches the recommended practice as 
>> best we can, then start talking to other vendors. If other vendors don't 
>> want to follow our design, that's totally fine, but we should ensure our 
>> design *allows* users to write code that will behave similarly for other 
>> implementations without diagnostics. e.g., we don't want to design something 
>> where the user has to use macros to avoid diagnostics in cross-compiler 
>> code. After that, we may want to propose some changes to the recommended 
>> practice to WG21.
>>
>> From my current thinking, it seems that there may be agreement that allowing 
>> these on arbitrary statements may be really difficult for users to 
>> understand the behavior of and that compound statements and labels are what 
>> we think is an understandable and useful feature and is also a strict subset 
>> of what we COULD support. By that, I think we should limit the feature to 
>> just compound statements and labels; this leaves the door open to allow the 
>> attributes on arbitrary statements in the future without breaking code. If 
>> we allow on arbitrary statements from the start, we can't later restrict the 
>> feature because we'd break code. This lets us start getting field experience 
>> with that behavior to see how we like it, which may also help when talking 
>> to other vendors because we'll have something concrete to talk about 
>> (hopefully). WDYT?
>
> I'm somewhat on the fence. In general I prefer to implement what the standard 
> expects. However the more I try to do that, the more I'm convinced what the 
> standard suggests is difficult to achieve and more importantly to difficult 
> explain to C++ users. In order to implement this feature properly I've been 
> looking at using the CFG. That allows me to follow the path of execution 
> properly. However it makes the feature more complicated. AFAIK the CFG is 
> only used in Clang for `AnalysisBasedWarnings::IssueWarnings`. This feels 
> like a hint, this is not the proper solution.

FWIW, that's where we implement the logic for the `[[fallthrough]]` attribute 
as well, so there's certainly precedent for it. When you consider how tightly 
coupled the feature is to control flow, using the control flow graph doesn't 
seem entirely unreasonable. I think the potential downside will be performance 
concerns (you have to opt in to fallthrough diagnostics, so you don't pay for 
the CFG expense unless you enable it), but those may be negligible depending on 
how you're implementing it.

> Another "interesting" issue with the current approach is, that it'll be hard 
> to determine the likelihood when looking at the code:
>
>   if(a) {
>       foo(a);
>       [[likely]] ++a;
>   }
>
> If `foo` is declared `[[noreturn]]` the `++a` is never executed so the 
> attribute shouldn't affect the likelihood. (This is of course great when 
> creating C++ trivia, but not great for users.)

Hah, what a fun use case! :-D

> Another issue is to properly implement it when a branch has as a simple `if` 
> inside it:
>
>   if(a > 5) {
>       if(a < 10)
>           ++a;
>       [[likely]] ++a;
>   }
>
> Obviously the attribute is in the `a > 5` branch and will always be executed, 
> but I haven't found a solution to figure that out using the CFG. I need to 
> find a way to determine the `a < 10` will always resume the original branch. 
> I'm convinced it can be done, but it made me wonder whether this is the best 
> approach.

You should be able to tell that information from the CFG, I believe, because 
the exit node for both paths should converge. However, "should be able to" and 
"can easily do" are not the same thing; I don't know how much of a challenge it 
would be.

> So I think it would be wise follow your suggestion. Implement a simple 
> version and try to convince other vendors to follow suit. This leaves the 
> door open to implement a more complex solution if needed. I think this 
> approach does C++ users and ourselves a favour. I only feel we need a slight 
> relaxation in the requirements. In your approach the attribute is only 
> allowed on a compound statement. I would like to allow it on the first 
> statement, like it was in my initial patch. This allows using the attribute 
> without being forced to use a compound statement. I feel forcing users to use 
> a certain coding style is a bad idea.
>
>   if(a) [[likely]] { // Good allowed on the compound statement
>     ...
>   }
>   
>   if(a)
>       [[likely]] return; // Good allowed on the first statement

Yes, I agree this makes sense -- the attribute is being written on the 
substatement of a control flow branch (whether it's a compound statement or 
just a regular statement).

> if(a) {
>
>   [[likely]] return; // Ignored, not on the first statement
>
> }

Agreed.

> if(a)                      // Attribute not allowed on a declaration,
>
>   [[likely]] int i = 5;  // but I can't think about a good use-case
>                          // for this code.

This is a fun example because I can think of a somewhat reasonable use-case but 
we can't support it without a compound statement. :-D The declaration could be 
an RAII object,

  if (a)
    [[likely]] SomeRAIIObj Obj(*a);

However, you're right that we cannot write the attribute there because a 
declaration-statement cannot be attributed 
(http://eel.is/c++draft/stmt.stmt#stmt.pre-1), so the attribute instead 
appertains to the declaration and not the statement. So the user would have to 
write:

  if (a) [[likely]] {
    SomeRAIIObj Obj(*a);
  }

That said, I think this is enough of a corner case that requiring the compound 
statement is not really a burden. If we find users run into that often (they'd 
get an attribute ignored error if they tried), we could add a fix-it to help 
them out, but that doesn't seem necessary initially.

> if(a) [[likely]] { // Good allowed on the compound statement
>
>     int i = 5;     // Now i seems to have a purpose
>   ...
>
> }
>
> if(a) [[likely]] {} // A diagnostic is issued and the attributes
> else [[likely]]] {} // are ignored
>
>   The diagnostics for ignored attributes are enabled by default so 
> contradiction is visible by default. For the `switch` I'll use the previous 
> proposal.
>   WDYT?

I think this all seems reasonable to me.

> If C++20 was still in development I would even consider to write a proposal 
> to make the attributes more like `__builtin_expect`:
>
>   [[likely]] if(a) {
>     // Considered to be arbitrarily likely
>   } else {
>     // Considered to be arbitrarily unlikely
>   }
>   
>   if(a) {
>   } [[unlikely]] else { // Not allowed on the else statement
>   }
>   
>   if(a) {
>     // No likelihood suggestion
>   } else
>     // No likelihood suggestion
>   
>     [[unlikely]] if{
>     // Considered to be arbitrarily likely
>   }
>
> But it's too late to consider this change.

Yeah, unfortunately, I think that ship has sailed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86559/new/

https://reviews.llvm.org/D86559

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D86559: [Sema, CodeGen] Allow [[likely]] and [[unlikely]] on labels

Reply via email to