I was starting to get fatalistically pessimistic about switch, but the all-colon-as-statement vs all-arrow-as-expression idea (with nothing in-between) seems pretty good! And would be even better if JLS impact were carefully checked.
-Doug

On 04/19/2018 04:44 PM, Brian Goetz wrote:
> We've been reviewing the work to date on switch expressions. Here's
> where we are, and here's a possible place we might move to, which I like
> a lot better than where we are now.
>
> ## Goals
>
> As a reminder, the primary goal here is _not_ switch
> expressions; switch expressions are supposed to just be an
> uncontroversial waypoint on the way to the real goal, which is a more
> expressive and flexible switch construct that works in a wider variety
> of situations, including supporting patterns, being less hostile to
> null, use as either an expression or a statement, etc.
>
> And the reason we think that improving switch is the right primary goal
> is because a "do one of these based on ..." construct is _better_ than
> the corresponding chain of if-else-if, for multiple reasons:
>
> - Possibility for the compiler to do exhaustiveness analysis,
>   potentially finding more bugs;
> - Possibility for more efficient dispatch -- a switch could be O(1),
>   whereas an if-else chain is almost certainly O(n);
> - More semantically transparent -- it's obvious the user is saying "do
>   one of these, based on ...";
> - Eliminates the need to repeat (and possibly get wrong) the switch
>   target.
>
> Switch does come with a lot of baggage (fallthrough by default,
> questionable scoping, need to explicitly break), and this baggage has
> produced the predictable distractions in the discussion -- a desire that
> we subordinate the primary goal (making switch more expressive) to the
> more contingent goal of "fixing" the legacy problems of switch.
>
> These legacy problems of switch may be unfortunate, but to whatever
> degree we end up ameliorating these, this has to be purely a
> side-benefit -- it's not the primary goal, no matter how annoying
> people find them.
> (The desire to "fix" the mistakes of the past is
> frequently a siren song, which is why we don't allow ourselves to take
> these as first-class requirements.)
>
> #### What we're not going to do
>
> The worst possible outcome (which is also the most commonly suggested
> "solution" in forums like reddit) would be to invent a new construct
> that is similar to, but not quite the same as switch (`snitch`), without
> being a 100% replacement for today's quirky switch. Today's switch is
> surely suboptimal, but it's not so fatally flawed that it needs to be
> euthanized, and we don't want to create an "undead" language construct
> forever, which everyone will still have to learn, and keep track of the
> differences between `switch` and `snitch`. No thank you.
>
> That means we extend the existing switch statement, and increase
> flexibility by supporting an expression form, and to the degree needed,
> embrace its quirks. ("No statement left behind.")
>
> #### Where we started
>
> In the first five minutes of working on this project, we sketched out
> the following (call it the "napkin sketch"), where an expression switch
> has case arms of the form:
>
>     case L -> e;
> or
>     case L -> { statement*; break e; }
>
> This was enough to get started, but of course the devil is in the details.
>
> #### Where we are right now
>
> We moved away from the napkin sketch for a few reasons, in part because
> it seemed to be drawing us down the road towards switch and snitch --
> which was further worrying as we still had yet to deal with the
> potential that pattern switch and constant switch might have differences
> as well. We want a unified model of switch that deals well enough with
> all the cases -- expressions and statements, patterns and constants.
>
> Our current model (call this Unification Attempt #1, or UA1 for short)
> is a step towards a unified model of switch, and this is a huge step
> forward.
> In this model, there's one switch construct, and there's one
> set of control flow rules, including for break (like return, break takes
> a value in a value context and is void in a void context).
>
> For convenience and safety, we then layered a shorthand atop
> value-bearing switches, which is to interpret
>
>     case L -> e;
>
> as
>
>     case L: break e;
>
> expecting the shorter form would be used almost all the time. (This has
> a pleasing symmetry with the expression form of lambdas, and (at least
> for expression switches) alleviates two of the legacy pain points.
> Switch expressions have other things in common with lambdas too; they
> are the only ones that can have statements; they are the only ones that
> interact with nonlocal control flow.)
>
> This approach offers a lot of flexibility (some would say too much).
> You can write "remi-style" expression switches:
>
>     int x = switch (y) {
>         case 1: break 2;
>         case 2: break 4;
>         default: break 8;
>     };
>
> or you can write "new-style" expression switches:
>
>     int x = switch (y) {
>         case 1 -> 2;
>         case 2 -> 4;
>         default -> 8;
>     };
>
> Some people like the transparency of the first; others like the
> compactness and fallthrough-safety of the second. And in cases where
> you mostly want the benefits of the second, but the real world conspires
> to make one or two cases difficult, you can mix them, and take full
> advantage of what "old switch" does -- with no new rules for control flow.
>
> #### Complaints
>
> There were the usual array of complaints over syntax -- many of which
> can be put down to "bleah, new is different, different is bad", but the
> most prominent one seems to be a generalized concern that other users
> (never us, of course, but we always fear for what others might do) won't
> be able to "handle" the power of mixed switches and will write terrible
> code, and then the world will burn.
> (And, because the mixing comes with
> fallthrough, it further engenders the "you idiots, you fixed the wrong
> thing" reactions.) Personally, I think the fear of mixing is deeply
> overblown -- I think in most cases people will gravitate towards one of
> the two clean styles, and only mix where the complexity of the real
> world forces them to, but there's value in understanding the
> underpinnings of such reactions, even if in the end they'd turn out to
> be much hot air about nothing.
>
> #### A real issue with mixing!
>
> But, there is a real problem with our approach, which is: while a
> unified switch is the right goal, UA1 is not unified _enough_.
> Specifically, we haven't fully aligned the statement forms, and this
> conspires to reduce expressiveness and safety. That is, in an
> expression switch you can say:
>
>     case L -> e;
>
> but in a statement switch you can't say
>
>     case L -> s;
>
> The reason for this is a purely accidental one: if we allowed this, then
> we _would_ likely find ourselves in the mixing hell that people are
> afraid of, which in turn would make the risk of accidental fallthrough
> _even worse_ than it is today. So the failing of mixing is not that it
> will be abused, but that it constrains us from actually getting to a
> unified construct.
>
> ## Closing the gap
>
> So, let's take one more step towards unifying the two forms (call this
> UA2), rather than a step away from it.
> Let's say that _all_ switches
> can support either old-style (colon) or new-style (arrow) case labels --
> but must stick to one kind of case label in a given switch:
>
>     // statement switch
>     switch (x) {
>         case 1: println("one"); break;
>         case 2: println("two"); break;
>     }
>
> or
>
>     // also statement switch
>     switch (x) {
>         case 1 -> println("one");
>         case 2 -> println("two");
>     }
>
> If a switch is a statement, the RHS is a statement, which can be a block
> statement:
>
>     case L -> { a; b; }
>
> We get there by first taking a step backwards, at least in terms of
> superficial syntax, to the syntax suggested by the napkin sketch, where
> if a switch is an expression, the RHS of an -> case is an expression or
> a block statement (in the latter case, it must complete abruptly by
> reason of either break-value or throw). Just as we expected "break
> value" to be rare in expression switches under UA1 since developers will
> generally prefer the shorthand form where applicable, we expect it to be
> equally rare under UA2.
>
> Then, as in UA1, we render unto expressions the things that belong to
> expressions; they must be total (an expression must yield a value or
> complete abruptly by reason of throwing.)
>
> #### Look, accidental benefits!
>
> Many of switch's failings (fallthrough, scoping) are not directly
> specified features, as much as emergent properties of the structure and
> control flow of switches. Since by definition you can't fall out of an
> arrow case, then an all-arrow switch gives the fallthrough-haters what
> they want "for free", with no need to treat it specially. In fact, it's
> even better; in the all-arrow form, all of the things people hate about
> switch -- the need to say break, the risk of fallthrough, and the
> questionable scoping -- all go away.
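
To make the fallthrough point concrete, here is a small sketch of the two statement forms side by side. It assumes a Java 14 or later compiler, where arrow-form statement switches are available; the class and method names (`ArrowVsColon`, `colonForm`, `arrowForm`) are invented for illustration and do not come from the proposal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names, for illustration only.
class ArrowVsColon {
    // Colon form: the first arm has no break, so control falls
    // through from case 1 into case 2.
    static List<String> colonForm(int x) {
        List<String> out = new ArrayList<>();
        switch (x) {
            case 1: out.add("one");          // no break here: falls through
            case 2: out.add("two"); break;
            default: out.add("other");
        }
        return out;
    }

    // Arrow form: each arm runs on its own; there is nothing to fall into.
    static List<String> arrowForm(int x) {
        List<String> out = new ArrayList<>();
        switch (x) {
            case 1 -> out.add("one");
            case 2 -> out.add("two");
            default -> out.add("other");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(colonForm(1)); // [one, two]
        System.out.println(arrowForm(1)); // [one]
    }
}
```

The only difference between the two methods is the case-label form, which is what decides whether the legacy control flow applies.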
>
> #### Scorecard
>
> There is one switch construct, which can be used as either an expression
> or a statement; when used as an expression, it acquires the
> characteristics of expressions (must be total, no nonlocal control flow
> out.) Each can be expressed in one of two syntactic forms (arrow and
> colon.) All forms will support patterns, null handling, and multiple
> labels per case. The control flow and scoping rules are driven by
> structural properties of the chosen form.
>
> The (statement, colon) case is the switch we have had since Java 1.0,
> enhanced as above (patterns, nulls, etc.)
>
> The (statement, arrow) case can be considered a nice syntactic shorthand
> for the previous, which obviates the annoyance of "break", implicitly
> prevents fallthrough of all forms, and avoids the confusion of current
> switch scoping. Many existing statement switches that are not
> expressions in disguise can be refactored to this.
>
> The (expression, colon) form is a subset of UA1, where you just never
> say "arrow".
>
> The (expression, arrow) case can again be considered a nice shorthand
> for the previous, again a subset of UA1, where you just never say
> "colon", and as a result, again don't have to think about fallthrough.
>
> Totality is a property of expression switches, regardless of form,
> because they are expressions, and expressions must be total.
>
> Fallthrough is a property of the colon-structured switches; there are no
> changes here.
>
> Nonlocal control flow _out_ of a switch (continue to an enclosing loop,
> break with label, return) is a property of statement switches.
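
A rough sketch of the two expression quadrants, plus nonlocal control flow out of a statement switch. One loud assumption: the mail spells the value-bearing statement `break e`, but the code below targets a Java 14 or later compiler, where that statement is spelled `yield e`, so `yield` stands in for break-with-value here. The names (`Quadrants`, `colonExpr`, `arrowExpr`, `countNonZero`) are made up for the example.

```java
// Hypothetical names, for illustration only.
class Quadrants {
    // (expression, colon): each arm provides its value explicitly.
    // `yield` is the Java 14+ spelling; the mail writes `break 2` etc.
    static int colonExpr(int y) {
        return switch (y) {
            case 1: yield 2;
            case 2: yield 4;
            default: yield 8;
        };
    }

    // (expression, arrow): shorthand for the method above.
    static int arrowExpr(int y) {
        return switch (y) {
            case 1 -> 2;
            case 2 -> 4;
            default -> 8;
        };
    }

    // (statement, arrow): a statement switch may jump out of itself,
    // here with `continue` to the enclosing loop.
    static int countNonZero(int[] xs) {
        int count = 0;
        for (int x : xs) {
            switch (x) {
                case 0 -> { continue; } // moves on to the next loop iteration
                default -> { }
            }
            count++; // skipped whenever the switch continued the loop
        }
        return count;
    }
}
```

Putting the same `continue` inside either expression form is rejected by the compiler, which matches the rule that only statement switches allow nonlocal control flow out.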
>
> So essentially, rather than dividing the semantics along
> expression/statement lines, and then attempting to opportunistically
> heap a bunch of irrelevant features like "no fallthrough" onto the
> expression side "because they're cool" even though they have nothing to
> do with expression-ness, we instead divide the world structurally: the
> colon form gives you the old control flow, and the arrow form gives you
> the new. And either can be used as a statement, or an expression. And
> no one will be confused by mixing.
>
> Orthogonality FTW. No statement gets left behind.
>
> ## Explaining it
>
> Relative to UA1, we could describe this as adding back the blocks (it's
> not really a block expression) from the napkin model, supporting an
> arrow form of statement switches with blocks too, and then restricting
> switches to all-arrow or all-colon. Then each quadrant is a restriction
> of this model. But that's not how we'd teach it.
>
> Relative to Java 10, we'd probably say:
>
> - Switch statements now come in a simpler (arrow) flavor, where there
>   is no fallthrough, no weird scoping, and no need to say break most of
>   the time. Many switches can be rewritten this way, and this form can
>   even be taught first.
> - Switches can be used as either expressions or statements, with
>   essentially identical syntax (some grammar differences, but this is
>   mostly interesting only to spec writers). If a switch is an expression,
>   it should contain expressions; if a switch is a statement, it should
>   contain statements.
> - Expression switches have additional restrictions that are derived
>   exclusively from their expression-ness: totality, can only complete
>   abruptly if by reason of throw.
> - We allow a break-with-value statement in an expression switch as a
>   means of explicitly providing the switch result; this can be combined
>   with a statement block to allow for statements+break-expression.
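
The last two bullets can be sketched as follows, again assuming a Java 14 or later compiler, in which the break-with-value statement described here is written `yield`. The enum and method names (`Color`, `describe`, `parseBit`) are hypothetical.

```java
// Hypothetical names, for illustration only.
class TotalSwitch {
    enum Color { RED, GREEN, BLUE }

    // Total without a default arm: every enum constant is covered.
    // The second arm is a block that ends by providing the result
    // (`yield` here; the mail's spelling is break-with-value).
    static String describe(Color c) {
        return switch (c) {
            case RED -> "warm";
            case GREEN, BLUE -> {
                String prefix = "cool";
                yield prefix + ":" + c;
            }
        };
    }

    // An arm of an expression switch may also complete abruptly by throwing.
    static int parseBit(char ch) {
        return switch (ch) {
            case '0' -> 0;
            case '1' -> 1;
            default -> throw new IllegalArgumentException("not a bit: " + ch);
        };
    }
}
```

Dropping the `GREEN, BLUE` arm from `describe` makes the switch non-total, and the compiler rejects it; that is the exhaustiveness check doing its job.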
>
> The result is one switch construct, with modern and legacy flavors,
> which supports either expressions or statements. You can immediately
> look at the middle of a switch and tell (by arrow vs colon) whether it
> has the legacy control flow or not.