I like this proposal and, in particular, I strongly support the "Closing the gap" section. Enforcing a uniform style within each switch allows clean and intuitive semantics for arrow switches, while giving old-style switches a straightforward migration path that tools can assist with.
-Dmitry

On Thu, Apr 19, 2018 at 1:44 PM, Brian Goetz <brian.go...@oracle.com> wrote:

> We've been reviewing the work to date on switch expressions. Here's where
> we are, and here's a possible place we might move to, which I like a lot
> better than where we are now.
>
> ## Goals
>
> As a reminder, remember that the primary goal here is _not_ switch
> expressions; switch expressions are supposed to just be an uncontroversial
> waypoint on the way to the real goal, which is a more expressive and
> flexible switch construct that works in a wider variety of situations,
> including supporting patterns, being less hostile to null, use as either an
> expression or a statement, etc.
>
> And the reason we think that improving switch is the right primary goal is
> because a "do one of these based on ..." construct is _better_ than the
> corresponding chain of if-else-if, for multiple reasons:
>
> - Possibility for the compiler to do exhaustiveness analysis, potentially
>   finding more bugs;
> - Possibility for more efficient dispatch -- a switch could be O(1),
>   whereas an if-else chain is almost certainly O(n);
> - More semantically transparent -- it's obvious the user is saying "do
>   one of these, based on ...";
> - Eliminates the need to repeat (and possibly get wrong) the switch
>   target.
>
> Switch does come with a lot of baggage (fallthrough by default,
> questionable scoping, need to explicitly break), and this baggage has
> produced the predictable distractions in the discussion -- a desire that we
> subordinate the primary goal (making switch more expressive) to the more
> contingent goal of "fixing" the legacy problems of switch.
>
> These legacy problems of switch may be unfortunate, but to whatever degree
> we end up ameliorating these, this has to be purely a side-benefit -- it's
> not the primary goal, no matter how annoying people find them.
> (The desire to "fix" the mistakes of the past is frequently a siren song,
> which is why we don't allow ourselves to take these as first-class
> requirements.)
>
> #### What we're not going to do
>
> The worst possible outcome (which is also the most commonly suggested
> "solution" in forums like reddit) would be to invent a new construct that
> is similar to, but not quite the same as switch (`snitch`), without being a
> 100% replacement for today's quirky switch. Today's switch is surely
> suboptimal, but it's not so fatally flawed that it needs to be euthanized,
> and we don't want to create an "undead" language construct forever, which
> everyone will still have to learn, and keep track of the differences
> between `switch` and `snitch`. No thank you.
>
> That means we extend the existing switch statement, and increase
> flexibility by supporting an expression form, and to the degree needed,
> embrace its quirks. ("No statement left behind.")
>
> #### Where we started
>
> In the first five minutes of working on this project, we sketched out the
> following (call it the "napkin sketch"), where an expression switch has
> case arms of the form:
>
>     case L -> e;
> or
>     case L -> { statement*; break e; }
>
> This was enough to get started, but of course the devil is in the details.
>
> #### Where we are right now
>
> We moved away from the napkin sketch for a few reasons, in part because it
> seemed to be drawing us down the road towards switch and snitch -- which
> was further worrying as we still had yet to deal with the potential that
> pattern switch and constant switch might have differences as well. We want
> a unified model of switch that deals well enough with all the cases --
> expressions and statements, patterns and constants.
>
> Our current model (call this Unification Attempt #1, or UA1 for short) is
> a step towards a unified model of switch, and this is a huge step forward.
> In this model, there's one switch construct, and there's one set of control
> flow rules, including for break (like return, break takes a value in a
> value context and is void in a void context).
>
> For convenience and safety, we then layered a shorthand atop value-bearing
> switches, which is to interpret
>
>     case L -> e;
>
> as
>
>     case L: break e;
>
> expecting the shorter form would be used almost all the time. (This has a
> pleasing symmetry with the expression form of lambdas, and (at least for
> expression switches) alleviates two of the legacy pain points. Switch
> expressions have other things in common with lambdas too; they are the only
> expressions that can have statements; they are the only ones that interact
> with nonlocal control flow.)
>
> This approach offers a lot of flexibility (some would say too much). You
> can write "remi-style" expression switches:
>
>     int x = switch (y) {
>         case 1: break 2;
>         case 2: break 4;
>         default: break 8;
>     };
>
> or you can write "new-style" expression switches:
>
>     int x = switch (y) {
>         case 1 -> 2;
>         case 2 -> 4;
>         default -> 8;
>     };
>
> Some people like the transparency of the first; others like the
> compactness and fallthrough-safety of the second. And in cases where you
> mostly want the benefits of the second, but the real world conspires to
> make one or two cases difficult, you can mix them, and take full advantage
> of what "old switch" does -- with no new rules for control flow.
>
> #### Complaints
>
> There was the usual array of complaints over syntax -- many of which can
> be put down to "bleah, new is different, different is bad", but the most
> prominent one seems to be a generalized concern that other users (never us,
> of course, but we always fear for what others might do) won't be able to
> "handle" the power of mixed switches and will write terrible code, and then
> the world will burn.
> (And, because the mixing comes with fallthrough, it
> further engenders the "you idiots, you fixed the wrong thing" reactions.)
> Personally, I think the fear of mixing is deeply overblown -- I think in
> most cases people will gravitate towards one of the two clean styles, and
> only mix where the complexity of the real world forces them to, but there's
> value in understanding the underpinnings of such reactions, even if in the
> end they'd turn out to be much hot air about nothing.
>
> #### A real issue with mixing!
>
> But, there is a real problem with our approach, which is: while a unified
> switch is the right goal, UA1 is not unified _enough_. Specifically, we
> haven't fully aligned the statement forms, and this conspires to reduce
> expressiveness and safety. That is, in an expression switch you can say:
>
>     case L -> e;
>
> but in a statement switch you can't say
>
>     case L -> s;
>
> The reason for this is a purely accidental one: if we allowed this, then
> we _would_ likely find ourselves in the mixing hell that people are afraid
> of, which in turn would make the risk of accidental fallthrough _even
> worse_ than it is today. So the failing of mixing is not that it will be
> abused, but that it constrains us from actually getting to a unified
> construct.
>
> ## Closing the gap
>
> So, let's take one more step towards unifying the two forms (call this
> UA2), rather than a step away from it.
> Let's say that _all_ switches can
> support either old-style (colon) or new-style (arrow) case labels -- but
> must stick to one kind of case label in a given switch:
>
>     // statement switch
>     switch (x) {
>         case 1: println("one"); break;
>         case 2: println("two"); break;
>     }
>
> or
>
>     // also statement switch
>     switch (x) {
>         case 1 -> println("one");
>         case 2 -> println("two");
>     }
>
> If a switch is a statement, the RHS is a statement, which can be a block
> statement:
>
>     case L -> { a; b; }
>
> We get there by first taking a step backwards, at least in terms of
> superficial syntax, to the syntax suggested by the napkin sketch, where if
> a switch is an expression, the RHS of an -> case is an expression or a
> block statement (in the latter case, it must complete abruptly by reason of
> either break-value or throw). Just as we expected "break value" to be rare
> in expression switches under UA1 since developers will generally prefer the
> shorthand form where applicable, we expect it to be equally rare under UA2.
>
> Then, as in UA1, we render unto expressions the things that belong to
> expressions; they must be total (an expression must yield a value or
> complete abruptly by reason of throwing.)
>
> #### Look, accidental benefits!
>
> Many of switch's failings (fallthrough, scoping) are not directly
> specified features, as much as emergent properties of the structure and
> control flow of switches. Since by definition you can't fall out of an
> arrow case, then an all-arrow switch gives the fallthrough-haters what they
> want "for free", with no need to treat it specially. In fact, it's even
> better; in the all-arrow form, all of the things people hate about switch
> -- the need to say break, the risk of fallthrough, and the questionable
> scoping -- all go away.
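
To make the fallthrough point concrete, here is a minimal sketch of the same statement switch in both forms. The class and method names (`ArrowVsColon`, `colonForm`, `arrowForm`) are made up for illustration, and the arrow form is written in the syntax that later shipped in Java 14, which I am assuming matches the proposal for statement switches.

```java
public class ArrowVsColon {
    // Old-style (colon) statement switch: a missing break falls through.
    static String colonForm(int x) {
        StringBuilder out = new StringBuilder();
        switch (x) {
            case 1: out.append("one");          // no break: falls into case 2
            case 2: out.append("two"); break;
            default: out.append("many");
        }
        return out.toString();
    }

    // New-style (arrow) statement switch: each arm runs on its own,
    // so there is nothing to fall through to and no break to forget.
    static String arrowForm(int x) {
        StringBuilder out = new StringBuilder();
        switch (x) {
            case 1 -> out.append("one");
            case 2 -> out.append("two");
            default -> out.append("many");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(colonForm(1)); // prints "onetwo"
        System.out.println(arrowForm(1)); // prints "one"
    }
}
```

The only difference between the two methods is the case-label style; the behavioral difference for `x == 1` comes entirely from the structure of the chosen form.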
>
> #### Scorecard
>
> There is one switch construct, which can be used as either an expression or
> a statement; when used as an expression, it acquires the characteristics of
> expressions (must be total, no nonlocal control flow out.) Each can be
> expressed in one of two syntactic forms (arrow and colon.) All forms will
> support patterns, null handling, and multiple labels per case. The control
> flow and scoping rules are driven by structural properties of the chosen
> form.
>
> The (statement, colon) case is the switch we have had since Java 1.0,
> enhanced as above (patterns, nulls, etc.)
>
> The (statement, arrow) case can be considered a nice syntactic shorthand
> for the previous, which obviates the annoyance of "break", implicitly
> prevents fallthrough of all forms, and avoids the confusion of current
> switch scoping. Many existing statement switches that are not expressions
> in disguise can be refactored to this.
>
> The (expression, colon) form is a subset of UA1, where you just never say
> "arrow".
>
> The (expression, arrow) case can again be considered a nice shorthand for
> the previous, again a subset of UA1, where you just never say "colon", and
> as a result, again don't have to think about fallthrough.
>
> Totality is a property of expression switches, regardless of form, because
> they are expressions, and expressions must be total.
>
> Fallthrough is a property of the colon-structured switches; there are no
> changes here.
>
> Nonlocal control flow _out_ of a switch (continue to an enclosing loop,
> break with label, return) is a property of statement switches.
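
A rough sketch of three of the quadrants, for anyone who wants to try them. One loud assumption: the proposal spells the value-bearing statement `break e`, but the feature as released in Java 14 spells it `yield e`, so the sketch uses `yield` in order to compile on a current JDK. The names `SwitchQuadrants`, `exprColon`, `exprArrow` and `sumUntilZero` are hypothetical.

```java
public class SwitchQuadrants {
    // (expression, colon): each arm supplies its value explicitly.
    // The proposal writes "break 2;" where released Java writes "yield 2;".
    static int exprColon(int y) {
        return switch (y) {
            case 1: yield 2;
            case 2: yield 4;
            default: yield 8;
        };
    }

    // (expression, arrow): shorthand for the above, no fallthrough to consider.
    static int exprArrow(int y) {
        return switch (y) {
            case 1 -> 2;
            case 2 -> 4;
            default -> 8;
        };
    }

    // (statement, arrow): nonlocal control flow out of the switch is allowed
    // because this switch is a statement, not an expression.
    static int sumUntilZero(int[] values) {
        int sum = 0;
        for (int v : values) {
            switch (Integer.signum(v)) {
                case -1 -> { continue; }    // skip negatives, continue the loop
                case 0 -> { return sum; }   // stop at the first zero
                default -> sum += v;
            }
        }
        return sum;
    }
}
```

Moving the `continue` or `return` arms into either expression switch is rejected by the compiler, which matches the "no nonlocal control flow out" row of the scorecard.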
>
> So essentially, rather than dividing the semantics along
> expression/statement lines, and then attempting to opportunistically heap a
> bunch of irrelevant features like "no fallthrough" onto the expression side
> "because they're cool" even though they have nothing to do with
> expression-ness, we instead divide the world structurally: the colon form
> gives you the old control flow, and the arrow form gives you the new. And
> either can be used as a statement, or an expression. And no one will be
> confused by mixing.
>
> Orthogonality FTW. No statement gets left behind.
>
> ## Explaining it
>
> Relative to UA1, we could describe this as adding back the blocks (it's not
> really a block expression) from the napkin model, supporting an arrow form
> of statement switches with blocks too, and then restricting switches to
> all-arrow or all-colon. Then each quadrant is a restriction of this
> model. But that's not how we'd teach it.
>
> Relative to Java 10, we'd probably say:
>
> - Switch statements now come in a simpler (arrow) flavor, where there is
>   no fallthrough, no weird scoping, and no need to say break most of the
>   time. Many switches can be rewritten this way, and this form can even be
>   taught first.
> - Switches can be used as either expressions or statements, with
>   essentially identical syntax (some grammar differences, but this is mostly
>   interesting only to spec writers). If a switch is an expression, it should
>   contain expressions; if a switch is a statement, it should contain
>   statements.
> - Expression switches have additional restrictions that are derived
>   exclusively from their expression-ness: totality, can only complete
>   abruptly if by reason of throw.
> - We allow a break-with-value statement in an expression switch as a
>   means of explicitly providing the switch result; this can be combined with
>   a statement block to allow for statements+break-expression.
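
The last two bullets (totality, and a block that ends by providing the result) might look like the sketch below. Same caveat as before: the proposal's `break value` is written `yield value` here because that is the spelling released Java accepts. The enum `Color` and the method `describe` are invented for the example.

```java
public class BlockArm {
    enum Color { RED, GREEN, BLUE }

    // An expression switch must be total. Covering every enum constant
    // satisfies that, so no default arm is needed here.
    static String describe(Color c) {
        return switch (c) {
            case RED -> "warm";
            // Multiple labels on one case, with a block on the right-hand side.
            // The block runs its statements, then provides the switch result.
            case GREEN, BLUE -> {
                String base = "cool";
                yield base + ":" + c.name();
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(Color.RED));  // prints "warm"
        System.out.println(describe(Color.BLUE)); // prints "cool:BLUE"
    }
}
```

Deleting the `case RED` arm makes the switch non-total, and the compiler rejects it rather than letting the expression complete without a value.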
>
> The result is one switch construct, with modern and legacy flavors, which
> supports either expressions or statements. You can immediately look at the
> middle of a switch and tell (by arrow vs colon) whether it has the legacy
> control flow or not.