Re: Letting the nulls flow (Was: Exhaustiveness)

Brian Goetz Sun, 23 Aug 2020 12:42:47 -0700

As I joked with Stephen on amber-dev, treating nulls specially in patterns like this (stapling a null-permit onto a pattern) feels like something from the "Bargaining" stage of the Kübler-Ross scale: "OK, fine, I'll let the nulls flow past the 'switch' gate, but I want each null to show me their permit before passing the 'case' gate." I would really like to get to the Acceptance stage, where we admit that `null` is something that needs to be treated uniformly. I think we're almost there.

(As I mentioned in my other mail this morning, with this new null-friendly disposition for switch, it seems possible that `case null` need not come first, since now `case null` is really the only case whose presence affects the overall behavior of switch -- and only for very specific switches. (I think that might have been some Bargaining too.) So if we think null-falling-into-default is desirable, then we can relax the "case null must come first" rule.)





On 8/23/2020 3:28 PM, fo...@univ-mlv.fr wrote:

I think we agree that switch should be able to be null-friendly,
the question is more what is the default, and how "default" works.

I wonder if "case null" is the right design if, for a lot of switches, the null behavior is the same as the behavior of an existing case or default. Currently case null is the first (depending on nesting) case, so you cannot easily say that null and another case share the same behavior.

Whatever we decide for "default", a syntax that lets you append null to an existing case seems better?

Something along the lines of "case Foo, null: ..."

Rémi

------------------------------------------------------------------------

    *From: *"Brian Goetz" <brian.go...@oracle.com>
    *To: *"Tagir Valeev" <amae...@gmail.com>
    *Cc: *"Remi Forax" <fo...@univ-mlv.fr>, "Guy Steele"
    <guy.ste...@oracle.com>, "amber-spec-experts"
    <amber-spec-experts@openjdk.java.net>
    *Sent: *Sunday, 23 August 2020 17:43:03
    *Subject: *Re: Letting the nulls flow (Was: Exhaustiveness)

    Thanks, Tagir -- this is a perfect example of what I meant
    yesterday by how the "blow early, blow often" approach is a false
    promise.  It just means that responsible programmers who need to
    deal with null as a fact-of-life have to do *extra* work (which is
    therefore more duplicative or error-prone) to deal with it.


    On 8/22/2020 11:46 PM, Tagir Valeev wrote:

        Hello!

        Some data from the current IntelliJ IDEA codebase

        We have 64 occurrences of this code pattern
        if($x$ == null) {...} // presumably completes abruptly
        switch($x$) {...}
        Roughly half of them are enum switches and the other half are string
        switches.

        Also, we have 29 occurrences of this code pattern:
        if($x$ != null) {
           switch($x$) { ... }
           ...
        }

        Also, we have one occurrence of this code pattern:
        if($x$ == null) {...
        } else {
           switch($x$) {...}
        }

        All of them could benefit from a null-friendly switch. By the way,
        the null branch is often the same as the default branch (or some
        other non-null branch).
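For readers who want the first pattern above in concrete form, here is a minimal sketch using only the classic switch statement. The class name, method names, and unit strings are hypothetical, chosen for illustration; they are not taken from the IntelliJ IDEA codebase. It shows the guard that a null-hostile switch forces on the caller, in a case where the null branch is the same as the default branch.

```java
class NullGuardedSwitch {
    // Hypothetical helper: maps a possibly-null unit name to a number of seconds.
    static int toSeconds(String unit) {
        // Guard required because a classic switch on a null String throws NPE.
        if (unit == null) {
            return 1; // same result as the default branch below
        }
        switch (unit) {
            case "minute": return 60;
            case "hour":   return 3600;
            default:       return 1;
        }
    }

    // Reports whether an unguarded classic switch throws for the given input.
    static boolean unguardedSwitchThrows(String unit) {
        try {
            switch (unit) {
                case "minute": return false;
                default:       return false;
            }
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(toSeconds(null));
        System.out.println(unguardedSwitchThrows(null));
    }
}
```

The guard and the default branch duplicate each other, which is the extra work the message above describes.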

        With best regards,
        Tagir Valeev

        On Sun, Aug 23, 2020 at 12:14 AM Brian Goetz <brian.go...@oracle.com> wrote:

            Breaking into a separate thread.   I hope we can put this one to bed
            once and for all.

                I'm not hostile to that view, but may I ask an honest
                question: why is this semantics better?
                Do you have examples where it makes sense to let the null
                slip through the statement switch? Because I can see why
                being null-hostile is a good default: it follows the mottos
                "blow early, blow often" or "in case of doubt, throw".

            Charitably, I think this approach is born of a belief that, if we
            keep the nulls out by posting sentries at the door, we can live an
            interior life unfettered by stray nulls.  But I think it is also
            time to recognize that this approach to "block the nulls at the
            door" (a) doesn't actually work, (b) creates sharp edges when the
            doors move (which they do, through refactoring), and (c) pushes
            the problems elsewhere.

            (To illustrate (c), just look at the conversation about nulls in
            patterns and switch we are having right now!  We all came to this
            exercise thinking "switch is null-hostile, that's how it's always
            been, that's how it must be", and are contorting ourselves to try
            to come up with a consistent explanation.  But, if we look deeper,
            we see that switch is *only accidentally* null-hostile, based on
            some highly contextual decisions that were made when adding enum
            and autoboxing in Java 5.  I'll talk more about that decision in a
            moment, but my point right now is that we are doing a _lot_ of
            work to try to be consistent with an arbitrary decision that was
            made in the past, in a specific and limited context, and probably
            not with the greatest care.  Truly, today's problems come from
            yesterday's "solutions."  If we weren't careful, an accidental
            decision about nulls in enum switch almost polluted the semantics
            of pattern matching!  That would be terrible!  So let's stop doing
            that, and let's stop creating new ways for our tomorrow's selves
            to be painted into a corner.)


            As background, I'll observe that every time a new context comes up,
            someone suggests "we should make it null-hostile."  (Closely
            related: we should make that new kind of variable immutable.)  And,
            nearly every time, this ends up being the wrong choice.  This
            happened with Streams; when we first wrestled with nulls in
            streams, someone pushed for "Just have streams throw on null
            elements."  But this would have been terrible; it would have meant
            that calculations on null-friendly domains, that were prepared to
            engage null directly, simply could not use streams in the obvious
            way; calculations like:

                  Stream.of(arrayOfStuff)
                              .map(Stuff::methodThatMightReturnNull)
                              .filter(x -> x != null)
                              .map(Stuff::doSomething)
                              .collect(toList())

            would not be directly expressible, because we would have already
            NPEed.  Sure, there are workarounds, but for what?  Out of a naive
            hope that, if we inject enough null checks, no one will ever have
            to deal with null?  Out of irrational hatred for nulls?  Nothing
            good comes from either of these motivations.
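As a runnable variant of the pipeline above, the following is a minimal sketch. The class `NullsInStreams` and the helper `trimToNull` are hypothetical stand-ins for `Stuff` and `methodThatMightReturnNull`; only the standard `java.util.stream` API is assumed. The intermediate `map` produces null elements, and the stream carries them to the `filter` stage instead of throwing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class NullsInStreams {
    // Hypothetical stand-in for a method that might return null.
    static String trimToNull(String s) {
        String t = s.trim();
        return t.isEmpty() ? null : t;
    }

    static List<String> shout(String[] inputs) {
        return Arrays.stream(inputs)
                .map(NullsInStreams::trimToNull)  // may yield null elements
                .filter(Objects::nonNull)         // the caller decides how to handle them
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(shout(new String[] {" a ", "   ", "b"}));
    }
}
```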

            But, this episode wasn't over.  It was then suggested "OK, we can't
            NPE, but how about we filter the nulls?"  Which would have been
            worse.  It would mean that, for example, doing a map+toArray on an
            array might not have the same size as the initial array -- which
            would violate what should be a pretty rock-solid intuition.  It
            would kill all the pre-sized-array optimizations.  It would mean
            `zip` would have no useful semantics.  Etc etc.
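The size-preservation intuition can be checked with a short sketch. The class and method names here are hypothetical; the lookup table is just an ordinary `Map`, whose `get` returns null for absent keys. Because the stream neither throws on nor drops those nulls, the output array lines up index by index with the input.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class MapKeepsLength {
    static String[] lookupAll(String[] keys, Map<String, String> table) {
        return Arrays.stream(keys)
                .map(table::get)           // null for keys that are absent
                .toArray(String[]::new);   // result[i] still corresponds to keys[i]
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("a", "1");
        String[] result = lookupAll(new String[] {"a", "zz", "a"}, table);
        System.out.println(result.length + " " + Arrays.toString(result));
    }
}
```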

            In the end, we came to the right answer for streams, which is "let
            the nulls flow".  And this was the right choice because Streams is
            general-purpose plumbing.  The "blow early" bias is about guarding
            the gates, and thereby hopefully keeping the nulls from getting
            into the house and having wild null parties at our expense.  And
            this works when the gates are few, fixed, and well marked.  But if
            your language exhibits any compositional mechanisms (which is our
            best tool), then what was the front door soon becomes the middle of
            the hallway after a trivial refactoring -- which means that no
            refactorings are really trivial.  Oof.

            We already went through a good example recently where it would be
            foolish to try to exclude null (and yet we tried anyway) --
            deconstruction patterns.  If a constructor

                  new Foo(x)

            can accept null, then a deconstructor

                  case Foo(var x)

            should dutifully serve up that null.  The guard-the-gates brigade
            tried valiantly to put up new gates at each deconstructor, but that
            would have been a foolish place to put such a boundary.  I offered
            an analogy to having deconstruction reject null over on amber-dev:

                In languages with side-effects (like Java), not all aggregation
                operations are reversible; if I bake a pie, I can't later
                recover the apples and the sugar.  But many are, and we like
                abstractions like these (collections, Optional, stream, etc)
                because they are very useful and easily reasoned about.  So
                those that are, should commit to the principle.  It would be OK
                for a list implementation to behave like this:

                     Listy list = new Listy();
                     list.add(null); // throws NPE

                because a List is free to express constraints on its domain.
                But it would be exceedingly bizarre for a list implementation
                to behave like this:

                     Listy list = new Listy();
                     list.add(3);     // ok, I like ints
                     list.add(null); // ok, I like nulls too
                     assertTrue(list.size() == 2);   // ok
                     assertTrue(list.get(0) == 3); // ok
                     assertTrue(list.get(1) == null);  // NPE!

                If the list takes in nulls, it should give them back.
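Both acceptable behaviors exist in the JDK today, which the following minimal sketch demonstrates (the class name `NullRoundTrip` and its methods are hypothetical). `ArrayDeque` constrains its domain by rejecting null at insertion, while `ArrayList` accepts null and hands it back unchanged. Neither accepts a null and then throws on retrieval.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

class NullRoundTrip {
    // ArrayList accepts null elements and returns them on get().
    static List<Integer> build() {
        List<Integer> list = new ArrayList<>();
        list.add(3);
        list.add(null);
        return list;
    }

    // ArrayDeque documents that it prohibits null elements: add(null) throws.
    static boolean dequeRejectsNull() {
        try {
            new ArrayDeque<Integer>().add(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> list = build();
        System.out.println(list.size() + " " + list.get(0) + " " + list.get(1));
        System.out.println(dequeRejectsNull());
    }
}
```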

            Now, this is like the first suggested form of null-hostility in
            streams, and to everyone's credit, no one suggested exactly that,
            but what was suggested was the second, silent form of hostility --
            just pretend you don't see the nulls.  And, like with streams, that
            would have been silly.  So, OK, we dodged the bullet of infecting
            patterns with special nullity rules.  Whew.

            Now, switch.  As I mentioned, I think we're here mostly because we
            are perpetuating the null biases of the past.  In Java 1.0,
            switches were only over primitives, so there was no question about
            nulls.  In Java 5, we added two new reference-typed switch targets:
            enums and boxes.  I wasn't in the room when that decision was made,
            but I can imagine how it went: Java 5 was a *very* full release,
            and under dramatic pressure to get out the door.  The discussion
            came up about nulls, maybe someone even suggested `case null` back
            then.  And I'm sure the answer was some form of "null enums and
            primitive boxes are almost always bugs, let's not bend over
            backwards and add new complexity to the language (case null) just
            to accommodate this bug, let's just throw NPE."

            And, given how limited switch was, and the special characteristics
            of enums and boxes, this was probably a pragmatic decision, but I
            think we lost sight of the subtleties of the context.  It is almost
            certainly right that 99.999% of the time, a null enum or box is a
            bug.  But this is emphatically not true when we broaden the type to
            Object.  Since the context and conditions change, the decision
            should be revisited before copying it to other contexts.

            In Java 7, when we added switching on strings, I do remember the
            discussion about nulls; it was mostly about "well, there's a
            precedent, and it's not worth breaking the precedent even if null
            strings are more common than null Integers, and besides, the
            mandate of Project Coin is very limited, and `case null` would
            probably be out of scope."  While this may have again been a
            pragmatic choice at the time given the constraints, it further set
            us down a slippery slope where the assumption that "switches always
            throw on null" is set in concrete.  But this assumption is not
            founded on solid ground.

            So, the better way to approach this is to imagine Java had no
            switch, and we were adding a general switch today.  Would we really
            be advocating so hard for "Oooh, another door we can guard, let's
            stick it to the nulls there too"?  (And, even if we were tempted
            to, should we?)

            The plain fact is that we got away with null-hostility in the first
            three forms of reference types in switch because switch (at the
            time) was such a weak and non-compositional mechanism, and there
            are darn few things it can actually do well.  But, if we were
            designing a general-purpose switch, with rich labels and enhanced
            control flow (e.g., guards) as we are today, where we envisioned
            refactoring between switches on nested patterns and patterns with
            nested switches, this would be more like a general plumbing
            mechanism, like streams, and when plumbing has an opinion about the
            nulls, frantic calls to the plumber are not far behind.  The nulls
            must flow unimpeded, because otherwise, we create new anomalies and
            blockages like the streams examples I gave earlier and refactoring
            surprises.  And having these anomalies doesn't really make life any
            better for the users -- it actually makes everything just less
            predictable, because it means simple refactorings are not simple --
            and in a way that is very easy to forget about.

            If we really could keep the nulls out at the front gate, and thus
            define a clear null-free domain to work in, then I would be far
            more sympathetic to the calls of "new gates, new guards!"  But the
            gates approach just doesn't work, and we have ample evidence of
            this.  And the richer and more compositional we make the language,
            the more sharp edges this creates, because old interiors become new
            gates.

            So, back to the case at hand (though we should bring the specifics
            back to the case-at-hand thread): what's happening here is our baby
            switch is growing up into a general purpose mechanism.  And, we
            should expect it to take on responsibilities suited to its new
            abilities.


            Now, for the backlash.  Whenever we make an argument for
            what-appears-to-be relaxing an existing null-hostility, there is
            much concern about how the nulls will run free and wreak havoc.
            But, let's examine that more closely.

            The concern seems to be that, if we let the null through the gate,
            we'll just get more NPEs, at worse places.  Well, we can't get more
            NPEs; at most, we can get exactly the same number.  But in reality,
            we will likely get fewer.  There are three cases.

            1.  The domain is already null-free.  In this case, it doesn't make
            a difference; no NPEs before, none after.

            2.  The domain is mostly null-free, but nulls do creep in, we see
            them as bugs, and we are happy to get notified.  This is the case
            today with enums, where a null enum is almost always a bug.  Yes,
            in cases like this, not guarding the gates means that the bug will
            get further before it is detected, or might go undetected.  This
            isn't fantastic, but this also isn't a disaster, because it is rare
            and it is still likely to get detected eventually.

            3.  The domain is at least partially null tolerant.  Here, we are
            moving an always-throw at the gates to a
            might-throw-in-the-guts-if-you-forget.  But also, there are plenty
            of things you can do with a null binding that don't NPE, such as
            pass it to a method that deals sensibly with nulls, add it to an
            ArrayList, print it, etc.  This is a huge improvement, from "must
            treat null in a special, out of band way" to "treat null
            uniformly."  At worst, it is no worse, and often better.

            And, when it comes to general purpose domains, #3 is much bigger
            than #2.  So I think we have to optimize for #3.
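To make case 3 concrete, here is a minimal sketch of the null-tolerant operations the text lists. The class `NullTolerantUses` and the method `describe` are hypothetical names used only for illustration. Each step accepts a null binding without throwing: storing it in an `ArrayList`, converting it for printing with `String.valueOf`, and comparing it with `Objects.equals`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class NullTolerantUses {
    static String describe(Object binding) {
        List<Object> seen = new ArrayList<>();
        seen.add(binding);                            // ArrayList stores null
        String printed = String.valueOf(binding);     // "null" for a null reference
        boolean isNull = Objects.equals(binding, null); // null-safe comparison
        return printed + "/" + seen.size() + "/" + isNull;
    }

    public static void main(String[] args) {
        System.out.println(describe(null));
        System.out.println(describe("x"));
    }
}
```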


            Finally, there are those who argue we should "just" have nullable
            types (T? and T!), and then all of this goes away.  I would love to
            get there, but it would be a very long road.  But let's imagine we
            do get there.  OMG how terrible it would be when constructs like
            lambdas, switches, or patterns willfully try to save us from the
            nulls, thus doing the job (badly) of the type system!  We'd have
            explicitly nullable types for which some constructs NPE anyway.
            Or, we'd have to redefine the semantics of everything in complex
            ways based on whether the underlying input types are nullable or
            not.  We would feel pretty stupid for having created new corners to
            paint ourselves into.

            Our fears of untamed nulls wantonly running through the streets are
            overblown.  Our attempts to contain the nulls through ad-hoc
            gate-guarding have all been failures.  Let the nulls flow.
