Re: break seen as a C archaism

Brian Goetz Tue, 13 Mar 2018 19:02:14 -0700

I get what you’re looking for, I really do.  Existing switch has a lot of 
warts, some of which could be avoided with a new expression switch construct.  
Avoiding warts seems like a good idea, and fall through is _even wartier_ with 
expression switches than statement switches.  (Not sure how you quantify 
wartiness.  Frogs per KLoC?)


But, the problem is that if we make an expression switch construct, now we have 
two switch constructs.  They’re similar, and you will frequently want to 
refactor between them, but they’re subtly different.  And let’s not assume that 
the new construct will not have warts, so now that’s two different sets of 
warts the user has to keep in their head.  Users who arrive at Java will ask, 
“why are there two subtly different ways to do basically the same thing?”  I 
don’t think that’s necessarily doing anyone a favor.  If the warts were fatal, 
sure, that’s what we’d do, but they’re not.  Mostly, it’s annoying that we’re 
stuck with them for another 20 years. 

While of course I may be too close to it, I think there’s a very nice balance 
in the current proposal.  In addition to control flow working exactly the same 
across the two forms, which I think does reduce the complexity of the language, 
the shorthand for “case X -> e” protects users from the most warty aspects 
almost all the time: 
 - the at-least-at-first weirdness of “break value”;
 - the need to say break at all;
 - the risk of accidental fall through.

But if you need either explicit break or fall-through, they’re there, and they 
work just like the break you’ve always known, for better or worse.  Just 
“break” the glass.  
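
A minimal sketch of the two forms side by side, under some stated assumptions: it targets Java 14 or later, where the value-bearing `break value` discussed in this thread ended up being spelled `yield value`, and the names `SwitchForms`, `Size`, `arrowForm`, and `colonForm` are hypothetical, chosen only for illustration. The first method uses only the arrow shorthand; the second uses the old-style colon labels inside an expression switch, with a deliberate fall-through.

```java
class SwitchForms {
    enum Size { SMALL, MEDIUM, LARGE }

    // Arrow shorthand: one expression per case, no break, no fall-through.
    static int arrowForm(Size s) {
        return switch (s) {
            case SMALL -> 1;
            case MEDIUM -> 2;
            case LARGE -> 3;
        };
    }

    // Escape hatch: colon labels inside an expression switch.
    // SMALL deliberately falls through into MEDIUM's statements.
    static int colonForm(Size s, StringBuilder log) {
        return switch (s) {
            case SMALL:
                log.append("small;");
                // no yield here, so control falls through to MEDIUM
            case MEDIUM:
                log.append("not large;");
                yield 1; // the proposal in this thread would write: break 1;
            case LARGE:
                log.append("large;");
                yield 2;
        };
    }
}
```

Calling `colonForm(Size.SMALL, log)` appends both the SMALL and MEDIUM messages before producing a value, which is the same control flow a statement switch would have.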

You’re saying “it wouldn’t be that bad if switch expressions had no fall 
through and only could take single expressions on the RHS, no statements.”  
Well, if you restrict yourself to the -> syntactic form, you get exactly that!  
So the only difference is that the escape hatch exists, if you’re willing to 
use it.  But if you’re not willing to use it, then you get exactly what you 
asked for. 
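
To make the "restrict yourself to the -> form" point concrete, here is a small sketch, again assuming Java 14 or later syntax and using hypothetical names (`ArrowOnly`, `legacy`, `arrowOnly`). The statement switch has a forgotten `break`, so one case silently runs into the next; the arrow-only version has no way to express that mistake.

```java
class ArrowOnly {
    // Statement switch with a forgotten break: case 1 falls into case 2.
    static String legacy(int code) {
        String label = "unknown";
        switch (code) {
            case 1:
                label = "one";
                // missing break: execution continues into case 2
            case 2:
                label = "two";
                break;
            default:
                break;
        }
        return label;
    }

    // Arrow-only expression switch: a single expression per case,
    // and no fall-through between cases.
    static String arrowOnly(int code) {
        return switch (code) {
            case 1 -> "one";
            case 2 -> "two";
            default -> "unknown";
        };
    }
}
```

Here `legacy(1)` returns "two" because of the accidental fall-through, while `arrowOnly(1)` returns "one".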


> On Mar 13, 2018, at 8:43 PM, Kevin Bourrillion <kev...@google.com> wrote:
> 
> Sorry for 5,000 inline replies.
> 
> 
> On Tue, Mar 13, 2018 at 1:18 PM Brian Goetz <brian.go...@oracle.com> wrote:
> 
> Thanks for the detailed analysis.  I'm glad to see that a larger percentage 
> could be converted to expression switch given sufficient effort;
> 
> Just to be clear: the sample was only from the 19% already identified as 
> convertible, so in this discussion that number is only going down. However, 
> the important point is that we should separately investigate the 81% to see 
> whether a few simple heuristics make them recognizable as e-switches too.
> 
>  
> I'm also not surprised to see that a lot had accidental reasons that kept 
> them off the happy path.  And, your analysis comports with expectations in 
> another way: that some required more significant intervention than others to 
> lever them into compliance.  For example, the ones with side-effecting 
> activities or logging, while good candidates for "strongly dissuade", happen 
> in the real world, and not all codebases have the level of discipline or 
> willingness to refactor that yours does.  So I'm not sure I'm ready to toss 
> either of the size-7 sub-buckets aside so quickly; not everyone is as sanguine 
> about "well, go refactor then" as you are.
> 
> Okay, I want to clarify a few things about this style of research, which is 
> how we have been evaluating API decisions for Guava and our other libraries 
> for a long time, but which I skipped really explaining.
> 
> First, we are using "existing code could refactor to" as a proxy for "what 
> might they probably write anew today if they could". It's a conceit, but a 
> useful one since the corpus of existing code has the benefit of being visible 
> and analyzable. :-) So, arguments that we make don't necessarily rest on 
> whether actual users will actually refactor. (Also, I mean, if they aren't 
> willing to refactor at all, then they wouldn't be changing to expression 
> switch at all so they'd be irrelevant to us anyway, but this is not the real 
> point.)
> 
> Second, we believe that the most useful (though not the only) way to judge the value of 
> a feature lies in comparing the best code users can write without that 
> feature to the best code they can write with it. Then we look at two factors: 
> (a) how much better did the feature make this code, and (b) how commonly do 
> we think this case comes up in real life. Roughly speaking we multiply (a) 
> and (b) together (for which I have with questionable taste attempted to coin 
> the phrase "utility times ubiquity") and that gives about how much we care 
> about making the change. This type of analysis has driven most decisions 
> about the shape of Guava's API.
> 
> It's not the only kind of argument that can be made. One can also add "it 
> might not make a large amount of 'great' code 'greater', but people who write 
> mediocre code will be more likely to write decent code!" I just think that we 
> should put less stock in such arguments than the other kind, because this 
> sounds like a problem better addressed with static analysis tools, education, 
> evangelism, and whatnot.
> 
>  
> Which is to say, adjusting the data brings the simple bucket from 86% (which 
> seemed low) to 93-95%, which is "most of the time" but not "all of the time". 
>  So most of the time, the code will not need whatever escape mechanism (break 
> or block), but it will often enough that having some escape hatch is needed.  
> 
> We think the ability to stick with procedural switch is that escape hatch 
> already.
> 
> 
> You didn't mention fallthrough, but because that's everyone's favorite 
> punching bag here, I'll add: most of the time, fallthrough is not needed in 
> expression switches, but having reviewed a number of low-level switches in 
> the JDK, it is sometimes desirable even for switches that can be converted to 
> expressions.  One of the motivations for refining break in this way is so 
> that when fallthrough is needed, the existing idiom just works as everyone 
> understands it to.
> 
> My assumption is that that code can just keep doing what it's already doing. 
> My claim is that there is only value to changing to expression switch if we 
> are getting the benefit of how much more simple and constrained it is.
> 
>> imho, early signs suggest that the grossness of `break x` is not nearly 
>> justified by the actual observed positive value of supporting 
>> multi-statement cases in expression switch. Are we open to killing that, or 
>> would we be if I produced more and clearer evidence?
> 
> That's one valid interpretation of the data, but not the only.  Whether 
> making break behave more like return (takes a value in non-void context, 
> doesn't take one in void context) is gross or natural is subjective.  Here's 
> my subjective interpretation: "great, most of the time, you don't have to use 
> the escape hatch, but when you do, it has control flow just like the break we 
> know, extended to take a value in non-void contexts, so will be fairly 
> familiar."
> 
> I think that there are features that make sense on their own, and there are 
> features that totally make lots of sense assuming that you have heard the 
> expert group's passionate explanation of why they make sense. (It reminds me 
> of a certain Pied Piper focus group near the end of Silicon Valley season 2, 
> but moving on.) I am concerned that "breaking a value" is of the second kind.
> 
> 
> But setting aside subjective reactions, are there better alternatives?  Let's 
> review what has been considered already, and why they've been passed over: 
> 
>  - Do nothing; only allow single expressions.  Non-starter.
> 
> We're just saying the feature seems to be at least 90% as applicable without 
> it. Roughly. Why is it a non-starter for the other 10% to stick with the 
> switch they've always had? I'm sure there are good answers to that, I'm not 
> doubting there are, but I think we should explore them instead of just 
> declaring something a non-starter by fiat.
> 
> 
>     case 1 -> 1;
>     case 2 -> { println("two"); break 2; }
> 
> But that is pretty similar to what we have now, just with braces.  If the 
> concern is that we're stretching `break` too far, then this is just as bad.  
> 
> Worse, it has two significant additional downsides: 
> 
> 1.  You can't fall through at all.  (Yes, I know some people think this is an 
> upside.)
> 
> Yes! That! That's what we want. No fall-through.
> 
>  
> But real code does use fallthrough, and this leaves them without any 
> alternative; it also widens the asymmetry of expression switch vs statement 
> switch.
> 
> Well, the other thread I started today is me literally asking for asymmetry 
> between this and statement switch. If we stop using the `switch` keyword, so 
> much the better.
> 
> What are the motivating use cases for fall-through in expression switch? 
> These must be exclusively examples featuring side-effects, right? Or is there 
> a way for a case to access the result produced by the previous one and build 
> on it?
> 
> 
>   (Combine this with other suggestions that widen the asymmetry between 
> pattern and non-pattern switch, and you have four switch constructs.  Oops.)  
> 
> (Not familiar with that stuff yet.)
> 
>  
> There might be other alternatives, but I don't see a better one, other than 
> deprecating switch and designing a whole new mechanism.
> 
> I'm confused. `switch` has worked the same way for 20+ years; what could 
> possibly motivate us to deprecate it?
> 
>  
> And, to defend what we've proposed: it's exactly the switch we all know, 
> warts and all.  Very little new; very little in the way of asymmetry between 
> void/value and pattern/constant.
> 
> (My response to this is already teed up in the other thread. Basically, it 
> says that if we don't make expression switch suitably constrained then I have 
> so far failed to grasp what its value is at all.)
> 
>  
>   The cost is that we have to accept the existing warts, primarily the weird 
> block expression (blocks of statements with break not surrounded by braces), 
> the weird scoping, and fallthrough.  
> 
> This choice reminds me of the old Yiddish proverb of the Tree of Sorrows.  
> (https://www.inspirationalstories.com/0/69.html).  
> 
> Tangent, but I think that story actually advocates that we stick with exactly 
> the switch statement we already have today.
> 
> 
> 
> If you've got something better ...  
> 
> 
> 
> On 3/13/2018 3:32 PM, Kevin Bourrillion wrote:
>> On Fri, Mar 9, 2018 at 3:21 PM, Louis Wasserman <lowas...@google.com> wrote:
>> 
>> Simplifying: let's call normal cases in a switch simple if they're a single 
>> statement or a no-op fallthrough, and let's call a default simple if it's a 
>> single statement or it's not there at all.
>> 
>> Among switches apparently convertible to expression switches,
>> 81% had all simple normal cases and a simple default.
>> 5% had all simple normal cases and a nonsimple default.
>> 12% had a nonsimple normal case and a simple default.
>> 2% had a nonsimple normal case and a nonsimple default.
>> 
>> I was surprised it was as high as 19%, so I grabbed a random sample of 45 of 
>> these occurrences from Google's codebase and reviewed them. My goal was to find 
>> evidence that multi-statement cases in expression switches are important and 
>> common. Spoiler: I found said evidence underwhelming.
>> 
>> There were 3 that I would call false matches (e.g. two that simply used a 
>> void `return` instead of `break` after every case without reason).
>> 
>> There were fully 20 out of the remaining 42 that I quickly concluded should 
>> be refactored regardless of anything else, and where that refactoring 
>> happens to leave them with only simple cases and simple/no default. These 
>> refactorings were varied (hoist out code common to all non-exception cases; 
>> simplify unreachable code; change to `if` if only 1-2 cases; extract a 
>> method (needing only 1-2 parameters) for a case that is much bigger than the 
>> others; switch from loop to Streams; change `if/else` to ?:; move a 
>> precondition check to a more appropriate location; and a few other varied 
>> cleanups).
>> 
>> Next there were 7 examples where the non-simple cases included 
>> side-effecting code, like setting fields or calling void methods. In Google 
>> Style I expect that we will probably forbid (or at least strongly dissuade) 
>> side effects in expression switch. I should probably bring this up 
>> separately, but I am pretty convinced by now that users should see 
>> expression switch and procedural switch as two completely different things, 
>> and by convention should always keep the former purely functional.
>> 
>> Next there were 7 examples where a case was "non-simple" only because it was 
>> using the "log, then return a null object (or null), instead of throwing an 
>> exception" anti-pattern. I was surprised this was that popular. Another 2 
>> used the "log-and-also-throw" anti-pattern.
>> 
>> 2 examples had a use-once local variable that saved a little bit of nesting. 
>> I wouldn't normally refactor these, but if expression switch had no 
>> mechanism for multi-statement cases, I wouldn't think twice about it.
>> 
>> 1 example had cases that looked nearly identical, 3 statements each, that 
>> could all be hoisted out of the switch, except that the types that differed 
>> across the three didn't implement a common interface (as they clearly should 
>> have). Slightly compelling.
>> 
>> 1 example had all simple cases except that one also wanted to check an 
>> assertion. Okay, slightly compelling.
>> 
>> Finally, the cases that were the most compelling to me: 3 examples had one 
>> or more large cases, where factoring them out into helper methods would imho 
>> be ugly because >=3 parameters would be required. If expression switch 
>> didn't permit multi-statement cases, I would just keep them as procedural 
>> switches. It's only 3 out of 42.
>> 
>> Summary:
>> 
>> imho, early signs suggest that the grossness of `break x` is not nearly 
>> justified by the actual observed positive value of supporting 
>> multi-statement cases in expression switch. Are we open to killing that, or 
>> would we be if I produced more and clearer evidence?
>> 
>> 
>> 
>> 
>> 
>> 
>> On Fri, Mar 9, 2018 at 2:56 PM Brian Goetz <brian.go...@oracle.com> wrote:
>> Did you happen to calculate what percentage was _not_ the "default" case?  I 
>> would expect that to be a considerable fraction.  
>> 
>> On 3/9/2018 5:49 PM, Kevin Bourrillion wrote:
>>> On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax <fo...@univ-mlv.fr> wrote:
>>> 
>>> When i asked what we should do instead, the answer is either:
>>>   1/ we should not allow block of codes in the expression switch but only 
>>> expression
>>>   2/ that we should use the lambda syntax with return, even if the 
>>> semantics is different from the lambda semantics.
>>> 
>>> I do not like (1) because i think the expression switch will become useless
>>> 
>>> In our (large) codebase, +Louis determined that, among switch statements 
>>> that appear translatable to expression switch, 13.8% of them seem to 
>>> require at least one multi-statement case.
>>> 
>> 
>> 
>> 
>> 
>> -- 
>> Kevin Bourrillion | Java Librarian | Google, Inc. | kev...@google.com
> 
> 
> -- 
> Kevin Bourrillion | Java Librarian | Google, Inc. | kev...@google.com
