Re: [patterns] Nullability in patterns, and pattern-aware constructs (again)

Brian Goetz Fri, 10 Jan 2020 12:01:47 -0800

Closing the loop, this raises the question of "what about instanceof and total patterns"? I posit that the following locutions are silly:


    if (e instanceof var x) { ... }    // always true
    if (e instanceof _) { ... }        // always true


and probably should be banned.  If we did, though, what about:

    if (e instanceof Object o) { ... } // always true
    if (e instanceof Object) { ... }   // always true, but currently allowed

I would think we would ban the former as well, but we have to keep the latter around for compatibility. (Which is partially why I discouraged calling the latter an "anonymous pattern" in the spec, and instead proposed to treat it as a different flavor of `instanceof`.)

So, proposed: disallow "any" patterns (_, var x, or total T x) in instanceof. Instanceof is for partial patterns.


Note that

    Point p;
    if (p instanceof Point(var x, var y)) { }

is total, but we wouldn't want to disallow it, as this pattern could still fail if p == null.
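A minimal sketch of that point, assuming Java 21 syntax for record patterns (the message predates any shipped syntax). `PointNullDemo`, `describe`, and the `Point` record with two `int` components are hypothetical names chosen for illustration:

```java
// Sketch only: assumes Java 21 record patterns; Point is a hypothetical record.
final class PointNullDemo {
    record Point(int x, int y) {}

    // The record pattern covers every non-null Point, yet instanceof
    // still answers false when p is null.
    static String describe(Point p) {
        if (p instanceof Point(var x, var y)) {
            return "point " + x + "," + y;
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2)));
        System.out.println(describe(null));
    }
}
```

So the `instanceof` test still does real work here: it is the null check.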

We might want to go a little further, and ban constant patterns in instanceof too, since all of the following have simpler forms:


    if (x instanceof null) { ... }
    if (x instanceof "") { ... }
    if (i instanceof 3) { ... }

Or not -- I suspect not.



On 1/8/2020 3:27 PM, Brian Goetz wrote:

In the past, we've gone around a few times on nullability and pattern matching. Back when we were enamored of `T?` types over in Valhalla land, we tentatively landed on using `T?` also for nullable type patterns. But the bloom came off that rose pretty quickly, and Valhalla is moving away from it, and that makes it far less attractive in this context.

There are a number of tangled concerns that we've tried a few times to unknot:
 - Construct nullability. Constructs to which we want to add pattern awareness (instanceof, switch) already have their own opinion about nulls. Instanceof always says false when presented with a null, and switch always NPEs.
 - Pattern nullability. Some patterns clearly would never match null (deconstruction patterns), whereas others (an "any" pattern, and surely the `null` constant pattern, if there was one) might make sense to match null.
 - Nesting vs top-level. Most of the time, we don't want to match null at the top level, but frequently in a nested position we do. This conflicts with...
 - Totality vs partiality. When a pattern is partial on the operand type (e.g., `case String` when the operand of switch is `Object`), it is almost never the case we want to match null (well, except for the `null` constant pattern), whereas when a pattern is total on the operand type (e.g., `case Object` in the same example), it is more justifiable to match null.
 - Refactoring friendliness. There are a number of cases that we would like to freely refactor back and forth (e.g., if-instanceof chain vs pattern switch). In particular, refactoring a switch on nested patterns to a nested switch (case Foo(T t), case Foo(U u) to a nested switch on T and U) is problematic under some of the interpretations of nested patterns.
 - Inference. It would be nice if a `var` pattern were simply inference for a type pattern, rather than some possibly-non-denotable union. (Both Scala and C# treat these differently, which means you have to choose between type inference and the desired semantics; I don't want to put users in the position of making this choice.)
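The "construct nullability" item above can be seen with constructs that already exist in the language. The following is a small sketch; the class and method names (`LegacyNullDemo`, `isString`, `switchThrowsOnNull`) are hypothetical and only there for illustration:

```java
// Sketch of the existing null behavior of instanceof and switch.
final class LegacyNullDemo {
    // instanceof answers false for a null operand.
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // A classic switch on a String throws NullPointerException for null.
    static boolean switchThrowsOnNull(String s) {
        try {
            switch (s) {
                case "a":
                    return false;
                default:
                    return false;
            }
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isString(null));           // false
        System.out.println(switchThrowsOnNull(null)); // true
    }
}
```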
Let's try (again) to untangle these.  A compelling example is this one:

    Box box;
    switch (box) {
        case Box(Chocolate c):
        case Box(Frog f):
        case Box(var o):
    }
It would be highly confusing and error-prone for either of the first two patterns to match Box(null) -- given that Chocolate and Frog have no type relation (ok, maybe they both implement `Edible`), it should be perfectly safe to reorder the two. But, because the last pattern is so obviously total on boxes, it is quite likely that what the author wants is to match all remaining boxes, including those that contain null. (Further, it would be super-bad if there were _no_ way to say "Match any Box, even if it contains null." While one might think this could be repaired with OR patterns, imagine that `Box` had N components -- we'd need to OR together 2^N patterns, with complex merging, to express all the possible combinations of nullity.)

Scala and C# took the path of saying that "var" patterns are not just type inference, they are "any" patterns -- so `Box(Object o)` matches boxes containing a non-null payload, where `Box(var o)` matches all boxes. I find this choice questionable (the story that `var` is just inference is nice), and it also puts users in the position of having to choose between the semantics they want and being explicit about types. I see the expedience of it, but I do not think this is the right answer for Java.

In the previous round, we posited that there were _type patterns_ (denoted `T t`) and _nullable type patterns_ (denoted `T? t`), which had the advantage that you could be explicit about what you wanted (nulls or not), and which was sort of banking on Valhalla plunking for the `T?` notation. But without that, having `T?` only in patterns, and nowhere else, will stick out like a sore thumb.
There are many ways to denote "T or null", of course:

 - Union types: `case (T|Null) t`
 - OR patterns: `case (T t) | (Null t)`, or `case (T t) | (null t)` (the former is a union with a null TYPE pattern, the latter with a null CONSTANT pattern)
 - Merging/fallthrough: `case T t, Null t`
 - Some way to spell "nullable T": `case T? t`, `case nullable T t`, `case T|null t`

But, I don't see any of these as being all that attractive in the Box case, when the most likely outcome is that the user wants the last case to match all boxes.

Here's a scheme that I think is workable, which we hovered near sometime in the past, and which I want to go back to. We'll start with the observation that `instanceof` and `switch` are currently hostile to nulls (instanceof says false, switch throws, and probably in the future, let/bind will do so also).

 - We accept that some constructs may have legacy hostility to nulls (but, see below for a possible relaxation);
 - There are no "nullable type patterns", just type patterns;
 - Type patterns that are _total_ on their target operand (`case T` on an operand of type `U`, where `U <: T`) match null, and non-total type patterns do not.
 - Var patterns can be considered "just type inference" and will mean the same thing as a type pattern for the inferred type.

In this world, the patterns that match null (if the construct allows it through) are `case null` and the total patterns -- which could be written `var x` (and maybe `_`, or maybe not), or `Object x`, or even a narrower type if the operand type is narrower.

In our Box example, this means that the last case (whether written as `Box(var o)` or `Box(Object o)`) matches all boxes, including those containing null (because the nested pattern is total on the nested operand), but the first two cases do not.
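As a rough illustration of the Box case under this scheme, here is a sketch written against Java 21 (records, record patterns, pattern switch), where nested total type patterns behave this way. The record declarations and the names `BoxTotalityDemo` and `classify` are assumptions made for the example; the message does not define `Box`, `Chocolate`, or `Frog`:

```java
// Sketch only: assumes Java 21; the records are hypothetical stand-ins.
final class BoxTotalityDemo {
    record Chocolate() {}
    record Frog() {}
    record Box(Object contents) {}

    static String classify(Box box) {
        return switch (box) {
            // Partial nested patterns: neither matches a Box containing null.
            case Box(Chocolate c) -> "chocolate";
            case Box(Frog f) -> "frog";
            // var o is inferred as Object, which is total on the component,
            // so this case also matches a Box containing null.
            case Box(var o) -> "anything else";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new Box(new Frog())));
        System.out.println(classify(new Box(null)));
    }
}
```

The switch itself stays null-hostile at the top level: passing a null `Box` to `classify` throws `NullPointerException`.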
An objection raised against this scheme earlier is that readers will have to look at the declaration site of the pattern to know whether the nested pattern is total. This is a valid concern (to be traded off against the other valid concerns), but this does not seem so bad in practice to me -- it will be common to use var or other broad type, in which case it will be obvious.

One problem with this interpretation is that we can't trivially refactor from
    switch (box) {
        case Box(Chocolate c):
        case Box(Frog f):
        case Box(var o):
    }

to

    switch (box) {
        case Box(var contents):
            switch (contents) {
                case Chocolate c:
                case Frog f:
                case Object o:
            }
    }
because the inner `switch (contents)` would NPE, because switch is null-hostile. Instead, the user would explicitly have to do an `if (contents == null)` test, and, if the intent was to handle null in the same way as the bottom case, some duplication of code would be needed. This is irritating, but I don't think it is disqualifying -- it is in the same category of null irritants that we have throughout the language.
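To make the refactoring hazard concrete, here is a sketch assuming Java 21, where a pattern switch without `case null` throws on a null operand. `NestedSwitchDemo`, `naive`, `guarded`, and the record declarations are hypothetical names for illustration:

```java
// Sketch only: assumes Java 21; all names are illustrative.
final class NestedSwitchDemo {
    record Chocolate() {}
    record Frog() {}
    record Box(Object contents) {}

    // Naive refactoring: the inner switch throws NullPointerException
    // when the box contains null.
    static String naive(Box box) {
        return switch (box) {
            case Box(var contents) -> switch (contents) {
                case Chocolate c -> "chocolate";
                case Frog f -> "frog";
                case Object o -> "anything else";
            };
        };
    }

    // Workaround: an explicit null test that duplicates the catch-all result.
    static String guarded(Box box) {
        return switch (box) {
            case Box(var contents) -> {
                if (contents == null) {
                    yield "anything else";
                }
                yield switch (contents) {
                    case Chocolate c -> "chocolate";
                    case Frog f -> "frog";
                    case Object o -> "anything else";
                };
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(guarded(new Box(null)));
    }
}
```

The duplicated `"anything else"` in `guarded` is the code duplication the paragraph above refers to.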
Similarly, we lose the pleasing decomposition that the nested pattern `P(Q)` is the same pattern as `P(alpha) & alpha instanceof Q` when P's first component might be null and the pattern Q is total -- because of the existing null-hostility of `instanceof`. (This is not unlike the complaint that Optional doesn't follow the monad law, with a similar consequence -- and a similar justification.)
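The lost decomposition can be sketched as follows, again assuming Java 21 semantics for record patterns. `DecompositionDemo`, `nested`, and `decomposed` are hypothetical names; `Box(Object o)` plays the role of `P(Q)` with a total `Q`:

```java
// Sketch only: assumes Java 21; names are illustrative.
final class DecompositionDemo {
    record Box(Object contents) {}

    // Nested form P(Q): Object o is total on the component, so null contents match.
    static boolean nested(Box box) {
        return box instanceof Box(Object o);
    }

    // Decomposed form P(alpha) && alpha instanceof Q:
    // the plain instanceof answers false for null contents.
    static boolean decomposed(Box box) {
        return box instanceof Box(var alpha) && alpha instanceof Object;
    }

    public static void main(String[] args) {
        Box empty = new Box(null);
        System.out.println(nested(empty));
        System.out.println(decomposed(empty));
    }
}
```

The two forms agree for every box with non-null contents and differ only for `new Box(null)`.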
So, summary:
 - the null constant pattern matches null;
 - "any" patterns match null;
 - A total type pattern is an "any" pattern;
 - var is just type inference;
 - no other patterns match null;
 - existing constructs retain their existing null behaviors.


I'll follow up with a separate message about switch null-hostility.
