Re: break seen as a C archaism

Brian Goetz Tue, 13 Mar 2018 13:19:49 -0700

Thanks for the detailed analysis. I'm glad to see that a larger percentage could be converted to expression switch given sufficient effort; I'm also not surprised to see that a lot had accidental reasons that kept them off the happy path. And, your analysis comports with expectations in another way: that some required more significant intervention than others to lever them into compliance. For example, the ones with side-effecting activities or logging, while good candidates for "strongly dissuade", happen in the real world, and not all codebases have the level of discipline or willingness to refactor that yours does. So I'm not sure I'm ready to toss either of the size-7 sub-buckets aside so quick; not everyone is as sanguine as "well, go refactor then" as you are.

Which is to say, adjusting the data brings the simple bucket from 86% (which seemed low) to 93-95%, which is "most of the time" but not "all of the time". So most of the time, the code will not need whatever escape mechanism (break or block), but it will often enough that having some escape hatch is needed.

You didn't mention fallthrough, but because that's everyone's favorite punching bag here, I'll add: most of the time, fallthrough is not needed in expression switches, but having reviewed a number of low-level switches in the JDK, it is sometimes desirable even for switches that can be converted to expressions. One of the motivations for refining break in this way is so that when fallthrough is needed, the existing idiom just works as everyone understands it to.
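
To make the fallthrough point concrete, here is a minimal sketch of the kind of low-level switch where falling through is the natural shape. It is an ordinary statement switch that computes a single value, so it is the sort of code one might want to turn into an expression. The names (`FallthroughSketch`, `packTail`) are hypothetical and the logic is illustrative only; it is not taken from the JDK.

```java
// Hypothetical example; not code from the JDK.
final class FallthroughSketch {
    // Packs the 1 to 3 bytes starting at offset into an int, lowest byte first.
    static int packTail(byte[] data, int offset, int remaining) {
        int result = 0;
        switch (remaining) {
            case 3:
                result |= (data[offset + 2] & 0xff) << 16;
                // deliberate fallthrough
            case 2:
                result |= (data[offset + 1] & 0xff) << 8;
                // deliberate fallthrough
            case 1:
                result |= (data[offset] & 0xff);
                break;
            default:
                break; // nothing to pack
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        System.out.println(packTail(data, 0, 3));
    }
}
```

With `remaining` equal to 3 all three assignments run; with 1 only the last one does. Writing this without fallthrough means repeating the lower cases' work in each higher case.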

    imho, early signs suggest that the grossness of `break x` is not /nearly/ justified by the actual observed positive value of supporting multi-statement cases in expression switch. Are we open to killing that, or would we be if I produced more and clearer evidence?

That's one valid interpretation of the data, but not the only. Whether making break behave more like return (takes a value in non-void context, doesn't take one in void context) is gross or natural is subjective. Here's my subjective interpretation: "great, most of the time, you don't have to use the escape hatch, but when you do, it has control flow just like the break we know, extended to take a value in non-void contexts, so will be fairly familiar."
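
As a rough illustration of that reading, here is a minimal sketch of a value-bearing exit from an old-style (colon-form) switch used as an expression. One loud assumption: the `break <value>` spelling under discussion in this thread is not what Java compilers from version 14 on accept; there the same role is spelled `yield`, so the sketch uses `yield` purely so that it compiles. The class and method names (`EscapeHatchSketch`, `pick`) are hypothetical.

```java
// Hypothetical sketch; assumes a Java 14+ compiler, where the value-bearing
// exit from a switch expression is spelled `yield` rather than `break <value>`.
final class EscapeHatchSketch {
    static int pick(int n) {
        return switch (n) {
            case 1:
                yield 1;                    // the common single-expression case
            case 2:
                System.out.println("two");  // an extra statement before the value
                yield 2;                    // leaves the switch, carrying the value 2
            default:
                yield 0;
        };
    }

    public static void main(String[] args) {
        System.out.println(pick(1));
        System.out.println(pick(2));
    }
}
```

The control flow is the familiar one: each case runs its statements and then exits the switch, the only difference being that the exit carries a value.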

But setting aside subjective reactions, are there better alternatives? Let's review what has been considered already, and why they've been passed over:


 - Do nothing; only allow single expressions.  Non-starter.

 - Traditional "block expressions"; { S; S; e }.  Terrible fit for Java, so no.

 - Some other form of block expression.  Seems a very big hammer for a small problem, which will surely interact with other features, and will likely call for follow-ons of its own.

 - Some sort of bespoke "block expression for switch".

On the latter, one obvious choice is something lambda-like:

    case 1 -> 1;
    case 2 -> { println("two"); return 2; }

You might argue that this is familiar because it's using `return` just like lambda, but ... yuck. Lambdas are their own invocation scope, so `return` can be twisted into making sense, but the block of a switch is not, so `return` is definitely the wrong word here. (Arguably it was the wrong word for lambdas too; had someone suggested `break` at the right time back then I would probably have been pretty compelled by this suggestion, but we picked `return` early (when we were still caught up in "lambdas are sugar for inner classes") and didn't look back. Oh well.) But it really seems like a bridge too far to use `return` here. The obvious alternative, then, is ... break:


    case 1 -> 1;
    case 2 -> { println("two"); break 2; }

But that is pretty similar to what we have now, just with braces. If the concern is that we're stretching `break` too far, then this is just as bad.


Worse, it has two significant additional downsides:

1. You can't fall through at all. (Yes, I know some people think this is an upside.) But real code does use fallthrough, and this leaves them without any alternative; it also widens the asymmetry of expression switch vs statement switch. (Combine this with other suggestions that widen the asymmetry between pattern and non-pattern switch, and you have four switch constructs. Oops.)

2. Either you can only use these block expressions in switch, in which case people hate us for one reason, or you can use them everywhere, and they hate us for another. (I have a hard time imagining that this doesn't run into conflicts with other contexts in which one could use break (how could it not), plus, I don't think this is the block expression idiom I want in the language anyway.)


So it seems like a half-measure that is worse on nearly every metric.

There might be other alternatives, but I don't see a better one, other than deprecating switch and designing a whole new mechanism. Which, while I understand the attraction of, I don't think that's doing the users a favor either.

And, to defend what we've proposed: it's exactly the switch we all know, warts and all. Very little new; very little in the way of asymmetry between void/value and pattern/constant. The cost is that we have to accept the existing warts, primarily the weird block expression (blocks of statements with break not surrounded by braces), the weird scoping, and fallthrough.

This choice reminds me of the old Yiddish proverb of the Tree of Sorrows. (https://www.inspirationalstories.com/0/69.html).


If you've got something better ...



On 3/13/2018 3:32 PM, Kevin Bourrillion wrote:

On Fri, Mar 9, 2018 at 3:21 PM, Louis Wasserman <lowas...@google.com> wrote:
    Simplifying: let's call normal cases in a switch simple if they're
    a single statement or a no-op fallthrough, and let's call a
    default simple if it's a single statement or it's not there at all.

    Among switches apparently convertible to expression switches,

      * 81% had all simple normal cases and a simple default.
      * 5% had all simple normal cases and a nonsimple default.
      * 12% had a nonsimple normal case and a simple default.
      * 2% had a nonsimple normal case and a nonsimple default.
I was surprised it was as high as 19%, so I grabbed a random sample of these 45 occurrences from Google's codebase and reviewed them. My goal was to find evidence that multi-statement cases in expression switches are important and common. Spoiler: I found said evidence underwhelming.
There were 3 that I would call false matches (e.g. two that simply used a void `return` instead of `break` after every case without reason).
There were fully 20 out of the remaining 42 that I quickly concluded should be refactored regardless of anything else, and where that refactoring happens to leave them with only simple cases and simple/no default. These refactorings were varied (hoist out code common to all non-exception cases; simplify unreachable code; change to `if` if only 1-2 cases; extract a method (needing only 1-2 parameters) for a case that is much bigger than the others; switch from loop to Streams; change `if/else` to ?:; move a precondition check to a more appropriate location; and a few other varied cleanups).
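
As a hedged illustration of one refactoring in that list, "extract a method (needing only 1-2 parameters) for a case that is much bigger than the others", the sketch below shows a switch before and after. Everything in it (`ExtractMethodSketch`, `Shape`, `polygonLabel`, the labels themselves) is hypothetical and invented for the example; none of it comes from the reviewed codebase.

```java
// Hypothetical example; names and logic are invented for illustration.
final class ExtractMethodSketch {
    enum Shape { CIRCLE, SQUARE, POLYGON }

    // Before: one case needs several statements, so it is "nonsimple".
    static String labelBefore(Shape shape, int sides) {
        switch (shape) {
            case CIRCLE:
                return "circle";
            case SQUARE:
                return "square";
            case POLYGON:
                String prefix = sides > 8 ? "many" : String.valueOf(sides);
                String label = prefix + "-gon";
                return label;
            default:
                throw new AssertionError(shape);
        }
    }

    // After: the large case is a one-parameter helper, so every case is a
    // single statement producing one value.
    static String labelAfter(Shape shape, int sides) {
        switch (shape) {
            case CIRCLE:
                return "circle";
            case SQUARE:
                return "square";
            case POLYGON:
                return polygonLabel(sides);
            default:
                throw new AssertionError(shape);
        }
    }

    static String polygonLabel(int sides) {
        String prefix = sides > 8 ? "many" : String.valueOf(sides);
        return prefix + "-gon";
    }

    public static void main(String[] args) {
        System.out.println(labelAfter(Shape.POLYGON, 5));
    }
}
```

After the change the switch has only simple cases, which is the shape that converts to an expression switch without any multi-statement mechanism.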
Next there were 7 examples where the non-simple cases included side-effecting code, like setting fields or calling void methods. In Google Style I expect that we will probably forbid (or at least strongly dissuade) side effects in expression switch. I should probably bring this up separately, but I am pretty convinced by now that users should see expression switch and procedural switch as two completely different things, and by convention should always keep the former purely functional.
Next there were 7 examples where a case was "non-simple" only because it was using the "log, then return a null object (or null), instead of throwing an exception" anti-pattern. I was surprised this was that popular. And another 2 that used the "log-and-also-throw" anti-pattern.
2 examples had a use-once local variable that saved a /little/ bit of nesting. I wouldn't normally refactor these, but if expression switch had no mechanism for multi-statement cases, I wouldn't think twice about it.
1 example had cases that looked nearly identical, 3 statements each, that could all be hoisted out of the switch, except that the types that differed across the three didn't implement a common interface (as they clearly should have). Slightly compelling.
1 example had all simple cases except that one also wanted to check an assertion. Okay, slightly compelling.
Finally, the cases that were the most compelling to me: 3 examples had one or more large cases, where factoring them out into helper methods would imho be ugly because >=3 parameters would be required. If expression switch didn't permit multi-statement cases, I would just keep them as procedural switches. It's only 3 out of 42.
Summary:
imho, early signs suggest that the grossness of `break x` is not /nearly/ justified by the actual observed positive value of supporting multi-statement cases in expression switch. Are we open to killing that, or would we be if I produced more and clearer evidence?
    On Fri, Mar 9, 2018 at 2:56 PM Brian Goetz <brian.go...@oracle.com> wrote:

        Did you happen to calculate what percentage was _not_ the
        "default" case?  I would expect that to be a considerable
        fraction.

        On 3/9/2018 5:49 PM, Kevin Bourrillion wrote:
        On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax <fo...@univ-mlv.fr> wrote:

            When i asked what we should do instead, the answer is either:
              1/ we should not allow block of codes in the expression
            switch but only expression
              2/ that we should use the lambda syntax with return,
            even if the semantics is different from the lambda semantics.

            I do not like (1) because i think the expression switch
            will become useless


        In our (large) codebase, +Louis determined that, among switch
        statements that appear translatable to expression switch,
        13.8% of them seem to require at least one multi-statement case.
--
Kevin Bourrillion | Java Librarian | Google, Inc. | kev...@google.com
