Hello!

Some data from the current IntelliJ IDEA codebase.
We have 64 occurrences of this code pattern:

if($x$ == null) {...} // presumably completes abruptly
switch($x$) {...}

Roughly half of them are enum switches and the other half are string
switches.

Also, we have 29 occurrences of this code pattern:

if($x$ != null) {
  switch($x$) { ... }
  ...
}

Also, we have one occurrence of this code pattern:

if($x$ == null) {... }
else {
  switch($x$) {...}
}

All of them could benefit from a null-friendly switch. Btw, often the
null branch is the same as the default branch (or some other non-null
branch).

With best regards,
Tagir Valeev

On Sun, Aug 23, 2020 at 12:14 AM Brian Goetz <brian.go...@oracle.com> wrote:
>
> Breaking into a separate thread. I hope we can put this one to bed
> once and for all.
>
> > I'm not hostile to that view, but may I ask an honest question: why
> > is this semantics better?
> > Do you have examples where it makes sense to let the null slip
> > through the statement switch? Because I can see why being null
> > hostile is a good default; it follows the mottos "blow early, blow
> > often" or "in case of doubt, throw".
>
> Charitably, I think this approach is born of a belief that, if we keep
> the nulls out by posting sentries at the door, we can live an interior
> life unfettered by stray nulls. But I think it is also time to
> recognize that this approach to "block the nulls at the door" (a)
> doesn't actually work, (b) creates sharp edges when the doors move
> (which they do, through refactoring), and (c) pushes the problems
> elsewhere.
>
> (To illustrate (c), just look at the conversation about nulls in
> patterns and switch we are having right now! We all came to this
> exercise thinking "switch is null-hostile, that's how it's always been,
> that's how it must be", and are contorting ourselves to try to come up
> with a consistent explanation.
> But, if we look deeper, we see that
> switch is *only accidentally* null-hostile, based on some highly
> contextual decisions that were made when adding enum and autoboxing in
> Java 5. I'll talk more about that decision in a moment, but my point
> right now is that we are doing a _lot_ of work to try to be consistent
> with an arbitrary decision that was made in the past, in a specific and
> limited context, and probably not with the greatest care. Truly today's
> problems come from yesterday's "solutions." If we weren't careful, an
> accidental decision about nulls in enum switch almost polluted the
> semantics of pattern matching! That would be terrible! So let's stop
> doing that, and let's stop creating new ways for our tomorrow's selves
> to be painted into a corner.)
>
>
> As background, I'll observe that every time a new context comes up,
> someone suggests "we should make it null-hostile." (Closely related: we
> should make that new kind of variable immutable.) And, nearly every
> time, this ends up being the wrong choice. This happened with Streams;
> when we first wrestled with nulls in streams, someone pushed for "Just
> have streams throw on null elements." But this would have been
> terrible; it would have meant that calculations on null-friendly
> domains, that were prepared to engage null directly, simply could not
> use streams in the obvious way; calculations like:
>
>     Stream.of(arrayOfStuff)
>           .map(Stuff::methodThatMightReturnNull)
>           .filter(x -> x != null)
>           .map(Stuff::doSomething)
>           .collect(toList())
>
> would not be directly expressible, because we would have already NPEed.
> Sure, there are workarounds, but for what? Out of a naive hope that, if
> we inject enough null checks, no one will ever have to deal with null?
> Out of irrational hatred for nulls? Nothing good comes from either of
> these motivations.
>
> But, this episode wasn't over. It was then suggested "OK, we can't NPE,
> but how about we filter the nulls?"
> Which would have been worse. It
> would mean that, for example, doing a map+toArray on an array might not
> produce a result of the same size as the initial array -- which would
> violate what should be a pretty rock-solid intuition. It would kill all
> the pre-sized-array optimizations. It would mean `zip` would have no
> useful semantics. Etc etc.
>
> In the end, we came to the right answer for streams, which is "let the
> nulls flow". And this was the right choice because Streams is
> general-purpose plumbing. The "blow early" bias is about guarding the
> gates, and thereby hopefully keeping the nulls from getting into the
> house and having wild null parties at our expense. And this works when
> the gates are few, fixed, and well marked. But if your language
> exhibits any compositional mechanisms (and composition is our best
> tool), then what was the front door soon becomes the middle of the
> hallway after a trivial refactoring -- which means that no refactorings
> are really trivial. Oof.
>
> We already went through a good example recently where it would be
> foolish to try to exclude null (and yet we tried anyway) --
> deconstruction patterns. If a constructor
>
>     new Foo(x)
>
> can accept null, then a deconstructor
>
>     case Foo(var x)
>
> should dutifully serve up that null. The guard-the-gates brigade tried
> valiantly to put up new gates at each deconstructor, but that would have
> been a foolish place to put such a boundary. I offered an analogy to
> having deconstruction reject null over on amber-dev:
>
> > In languages with side-effects (like Java), not all aggregation
> > operations are reversible; if I bake a pie, I can't later recover the
> > apples and the sugar. But many are, and we like abstractions like
> > these (collections, Optional, stream, etc) because they are very
> > useful and easily reasoned about. So those that are, should commit to
> > the principle.
> > It would be OK for a list implementation to behave
> > like this:
> >
> >     Listy list = new Listy();
> >     list.add(null);  // throws NPE
> >
> > because a List is free to express constraints on its domain. But it
> > would be exceedingly bizarre for a list implementation to behave like
> > this:
> >
> >     Listy list = new Listy();
> >     list.add(3);     // ok, I like ints
> >     list.add(null);  // ok, I like nulls too
> >     assertTrue(list.size() == 2);     // ok
> >     assertTrue(list.get(0) == 3);     // ok
> >     assertTrue(list.get(1) == null);  // NPE!
> >
> > If the list takes in nulls, it should give them back.
>
> Now, this is like the first suggested form of null-hostility in streams,
> and to everyone's credit, no one suggested exactly that, but what was
> suggested was the second, silent form of hostility -- just pretend you
> don't see the nulls. And, like with streams, that would have been
> silly. So, OK, we dodged the bullet of infecting patterns with special
> nullity rules. Whew.
>
> Now, switch. As I mentioned, I think we're here mostly because we are
> perpetuating the null biases of the past. In Java 1.0, switches were
> only over primitives, so there was no question about nulls. In Java 5,
> we added two new reference-typed switch targets: enums and boxes. I
> wasn't in the room when that decision was made, but I can imagine how it
> went: Java 5 was a *very* full release, and under dramatic pressure to
> get out the door. The discussion came up about nulls, maybe someone
> even suggested `case null` back then. And I'm sure the answer was some
> form of "null enums and primitive boxes are almost always bugs, let's
> not bend over backwards and add new complexity to the language (case
> null) just to accommodate this bug, let's just throw NPE."
>
> And, given how limited switch was, and the special characteristics of
> enums and boxes, this was probably a pragmatic decision, but I think we
> lost sight of the subtleties of the context.
> It is almost certainly
> right that 99.999% of the time, a null enum or box is a bug. But this
> is emphatically not true when we broaden the type to Object. Since the
> context and conditions change, the decision should be revisited before
> copying it to other contexts.
>
> In Java 7, when we added switching on strings, I do remember the
> discussion about nulls; it was mostly about "well, there's a precedent,
> and it's not worth breaking the precedent even if null strings are more
> common than null Integers, and besides, the mandate of Project Coin is
> very limited, and `case null` would probably be out of scope." While
> this may have again been a pragmatic choice at the time given the
> constraints, it further set us down a slippery slope where the
> assumption that "switches always throw on null" is set in concrete. But
> this assumption is not founded on solid ground.
>
> So, the better way to approach this is to imagine Java had no switch,
> and we were adding a general switch today. Would we really be
> advocating so hard for "Oooh, another door we can guard, let's stick it
> to the nulls there too"? (And, even if we were tempted to, should we?)
>
> The plain fact is that we got away with null-hostility in the first
> three forms of reference types in switch because switch (at the time)
> was such a weak and non-compositional mechanism, and there were darn few
> things it could actually do well. But, if we were designing a
> general-purpose switch, with rich labels and enhanced control flow
> (e.g., guards) as we are today, where we envisioned refactoring between
> switches on nested patterns and patterns with nested switches, this
> would be more like a general plumbing mechanism, like streams, and when
> plumbing has an opinion about the nulls, frantic calls to the plumber
> are not far behind. The nulls must flow unimpeded, because otherwise,
> we create new anomalies and blockages like the streams examples I gave
> earlier and refactoring surprises.
> And having these anomalies doesn't
> really make life any better for the users -- it actually makes
> everything just less predictable, because it means simple refactorings
> are not simple -- and in a way that is very easy to forget about.
>
> If we really could keep the nulls out at the front gate, and thus define
> a clear null-free domain to work in, then I would be far more
> sympathetic to the calls of "new gates, new guards!" But the gates
> approach just doesn't work, and we have ample evidence of this. And the
> richer and more compositional we make the language, the more sharp edges
> this creates, because old interiors become new gates.
>
> So, back to the case at hand (though we should bring the specifics back
> to the case-at-hand thread): what's happening here is our baby switch is
> growing up into a general purpose mechanism. And, we should expect it
> to take on responsibilities suited to its new abilities.
>
>
> Now, for the backlash. Whenever we make an argument for
> what-appears-to-be relaxing an existing null-hostility, there is much
> concern about how the nulls will run free and wreak havoc. But, let's
> examine that more closely.
>
> The concern seems to be that, if we let the null through the gate,
> we'll just get more NPEs, at worse places. Well, we can't get more
> NPEs; at most, we can get exactly the same number. But in reality, we
> will likely get fewer. There are three cases.
>
> 1. The domain is already null-free. In this case, it doesn't make a
> difference; no NPEs before, none after.
>
> 2. The domain is mostly null-free, but nulls do creep in, we see them
> as bugs, and we are happy to get notified. This is the case today with
> enums, where a null enum is almost always a bug. Yes, in cases like
> this, not guarding the gates means that the bug will get further before
> it is detected, or might go undetected.
> This isn't fantastic, but this
> also isn't a disaster, because it is rare and it is still likely to be
> detected eventually.
>
> 3. The domain is at least partially null tolerant. Here, we are moving
> an always-throw at the gates to a
> might-throw-in-the-guts-if-you-forget. But also, there are plenty of
> things you can do with a null binding that don't NPE, such as pass it to
> a method that deals sensibly with nulls, add it to an ArrayList, print
> it, etc. This is a huge improvement, from "must treat null in a
> special, out of band way" to "treat null uniformly." At worst, it is no
> worse, and often better.
>
> And, when it comes to general purpose domains, #3 is much bigger than
> #2. So I think we have to optimize for #3.
>
>
> Finally, there are those who argue we should "just" have nullable types
> (T? and T!), and then all of this goes away. I would love to get there,
> but it would be a very long road. But let's imagine we do get there.
> OMG how terrible it would be when constructs like lambdas, switches, or
> patterns willfully try to save us from the nulls, thus doing the job
> (badly) of the type system! We'd have explicitly nullable types for
> which some constructs NPE anyway. Or, we'd have to redefine the
> semantics of everything in complex ways based on whether the underlying
> input types are nullable or not. We would feel pretty stupid for having
> created new corners to paint ourselves into.
>
> Our fears of untamed nulls wantonly running through the streets are
> overblown. Our attempts to contain the nulls through ad-hoc
> gate-guarding have all been failures. Let the nulls flow.
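
For illustration, the first pattern counted at the top of this message
(a null guard that completes abruptly, followed by the switch) might
look like the sketch below in plain Java. This is a minimal sketch, not
code from the IntelliJ IDEA codebase: the class name, method names, and
string values are all hypothetical. It also shows what the guard
protects against, since a switch statement on a null string throws
NullPointerException.

```java
class NullGuardSwitchSketch {

    // Hypothetical instance of the first pattern: the null guard completes
    // abruptly (returns), so the switch below never sees null.
    static String describeGuarded(String color) {
        if (color == null) {
            return "unknown"; // here the null branch equals the default branch
        }
        switch (color) {
            case "red":
                return "warm";
            case "blue":
                return "cool";
            default:
                return "unknown";
        }
    }

    // The same switch without the guard: a null selector makes the switch
    // statement itself throw NullPointerException.
    static String describeUnguarded(String color) {
        switch (color) {
            case "red":
                return "warm";
            case "blue":
                return "cool";
            default:
                return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeGuarded("red"));
        System.out.println(describeGuarded(null));
        try {
            describeUnguarded(null);
        } catch (NullPointerException e) {
            System.out.println("switch on null threw NullPointerException");
        }
    }
}
```

In this sketch the null branch and the default branch return the same
value, which is the situation described above as common. A null-friendly
switch would let the guard fold into the switch itself.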