> De: "Brian Goetz" <brian.go...@oracle.com> > À: "Tagir Valeev" <amae...@gmail.com> > Cc: "amber-spec-experts" <amber-spec-experts@openjdk.java.net> > Envoyé: Jeudi 20 Août 2020 21:09:00 > Objet: Re: [pattern-switch] Exhaustiveness
> Here's an attempt at a formalism for capturing this.
>
> There are several categories of patterns we might call total on a type T. We could refine the taxonomy as:
>
> - Strongly total -- matches all values of T.
> - Weakly total -- matches all values of T except perhaps null.
>
> What we want to do is characterize the aggregate totality on T of a _set_ of patterns P*. A set of patterns could in the aggregate be either of the above, or also:
>
> - Optimistically total -- matches all values of subtypes of T _known at compile time_, except perhaps null.
>
> Note that we have an ordering:
>
>     partial < optimistically total < weakly total < strongly total
>
> Now, some rules for defining the totality of a set of patterns.
>
> T-Total: The singleton set containing the type pattern `T t` is strongly total on U <: T. (This is the rule we've been discussing, but that's not the point of this mail -- we just need a base case right now.)
>
> T-Subset: If a set of patterns P* contains a subset of patterns that is X-total on T, then P* is X-total on T.
>
> T-Sealed: If T is sealed to U* (direct subtypes only), and for each U in U* there is some subset of P* that is optimistically total on U, then P* is optimistically total on T.
>
> T-Nested: Given a deconstructor D(U) and a collection of patterns { P1..Pn }, if { P1..Pn } is X-total on U, then { D(P1)..D(Pn) } is min(X, weak)-total on D.
>
> OK, examples. Let's say
>
>     Container<T> = Box<T> | Bag<T>
>     Shape = Round | Rect
>     Round = Circle | Ellipse
>
> { Container c }: total on Container by T-Total.
>
> { Box b, Bag b }: optimistically total on Container
> - Container sealed to Box and Bag
> - `Box b` total on Box, `Bag b` total on Bag
> - { Box b, Bag b } optimistically total on Container by T-Sealed
>
> { Box(Round r), Box(Rect r) }: optimistically total on Box<Shape>
> - Shape sealed to Round and Rect
> - { Round r, Rect r } optimistically total on Shape by T-Sealed
> - { Box(Round r), Box(Rect r) } optimistically total on Box<Shape> by T-Nested
>
> { Box(Object o) }: weakly total on Box<Object>
> - `Object o` total on Object
> - { Object o } total on Object by T-Subset
> - { Box(Object o) } weakly total on Box<Object> by T-Nested
>
> { Box(Rect r), Box(Circle c), Box(Ellipse e) }: optimistically total on Box<Shape>
> - Shape sealed to Round and Rect
> - { Rect r } total on Rect
> - { Circle c, Ellipse e } optimistically total on Round
> - { Rect r, Circle c, Ellipse e } is optimistically total on Shape, because for each of { Rect, Round } there is a subset that is optimistically total on that type
> - { Box(Rect r), Box(Circle c), Box(Ellipse e) } optimistically total on Box by T-Nested
>
> We can enhance this model to construct the residue (a characterization of what falls through the cracks), which therefore has to be handled by a catch-all in a putatively total switch. A grammar for the residue would be:
>
>     R := null | novel | D(R)
>
> so the residue might include Box(null), Box(novel), Bag(Box(novel)), etc. We would need to extend this to support deconstructors with multiple bindings too.
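For illustration only, here is one way the last example could be written down, encoding the domain as sealed interfaces and records and writing the deconstruction patterns with record components standing in for the deconstructor bindings; the class and method names are placeholders for this sketch.

    public class TotalitySketch {
        sealed interface Shape permits Round, Rect {}
        sealed interface Round extends Shape permits Circle, Ellipse {}
        record Circle() implements Round {}
        record Ellipse() implements Round {}
        record Rect() implements Shape {}

        sealed interface Container<T> permits Box, Bag {}
        record Box<T>(T content) implements Container<T> {}
        record Bag<T>(T content) implements Container<T> {}

        // { Box(Rect r), Box(Circle c), Box(Ellipse e) }: optimistically total on Box<Shape>
        // by T-Sealed (applied twice) and T-Nested. Box(null), or a Box of a Shape subtype
        // introduced later by separate compilation, matches none of these cases: that is
        // exactly the residue described above.
        static String describe(Box<Shape> box) {
            return switch (box) {
                case Box(Rect r)    -> "rect";
                case Box(Circle c)  -> "circle";
                case Box(Ellipse e) -> "ellipse";
            };
        }
    }

Dropping any one of the three cases would leave the set merely partial on Box<Shape>, which is why T-Sealed asks for a subset that is total on each sealed alternative.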
> OK, coming back to reality:
>
> - The patterns of a switch expression must be at least optimistically total.
> - The translation of a switch expression must include a synthetic case that catches all elements of the residue of its patterns, and throws the appropriate exceptions:
>   - NPE for a null
>   - ICCE for a novel value
>   - One of the above, or maybe something else, for D(novel), D(null), D(E(novel, null)), etc.
>
> We still have not addressed how we might nominate a _statement_ switch as being some form of total; that's a separate story. Also a separate story: under what conditions switches throw NPE in the new world. But this seems like progress.
>
> Given the weird shape of the residue, it's not clear there's a clean way to extrapolate the NPE|ICCE rule, since we might have Foo(null, novel) and would arbitrarily have to pick which exception to throw, and neither would really be all that great. Perhaps there's a new exception type lurking here.

I disagree here: if we move the null checks upfront of the type checks, an NPE will always be thrown before an ICCE on the same pattern, which is the usual rule for any JDK method (if you see a pattern match as the equivalent of a cascade of instanceof + method calls).

Rémi

> On 8/20/2020 12:58 PM, Tagir Valeev wrote:
>> Hello!
>>
>>> Are we in agreement on what _should_ happen in these cases?
>>
>> This sounds like a good idea. I see the following alternatives (not saying that they are better, just putting them on the table).
>>
>> 1. Do not perform exhaustiveness analysis for nested patterns, except if all nested components are total. So, in your example, either an explicit default, or a `case Box` and a `case Bag`, will be necessary. This is somewhat limiting, but I'm not sure that exhaustiveness of complex nested patterns would be very popular. If one really needs this, they could use nested switch expressions:
>>
>>     return switch (container) {
>>         case Box(var s) -> switch (s) { case Rect r -> ...; case Circle c -> ...; /* optional case null is possible */ };
>>         case Bag(var s) -> switch (s) { case Rect r -> ...; case Circle c -> ...; /* optional case null is possible */ };
>>     };
>>
>> Here Box(var s) has a total nested component, so it matches everything that Box matches, and the same goes for Bag; thus we perform exhaustiveness analysis using the Container declaration.
>>
>> This approach does not close the door for us. We can rethink later and add exhaustiveness analysis for nested patterns when we finally determine how it should look. Note that this still allows making Optional.of(var x) + Optional.empty exhaustive if we provide an appropriate mechanism to declare this kind of pattern.
>>
>> 2. Allow deconstructors and records to specify whether null is a possible value for a given component. For example, make deconstructors and records null-hostile by default and provide a syntax (T? or T|null or whatever) to allow nulls. In this case, if the deconstructor is null-friendly, then the exhaustive pattern must handle Box(null). Otherwise, Box(null) is a compilation error. Yes, I know, this may quickly evolve to the point where we will need to add full-fledged nullability to the type system. But perhaps it's possible to allow nullability specification for new language features only? Ok, ok, sorry, my imagination drove me too far away from reality. Forget it.
>>
>> With best regards,
>> Tagir Valeev.
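To make alternative 1 concrete, here is a sketch with a minimal stand-alone domain written as sealed interfaces and records (Shape sealed to Rect and Circle, Container sealed to Box and Bag, matching the Container<Shape> example in this thread); the class and method names are placeholders. Because none of the nested component patterns is total, alternative 1 would demand an explicit catch-all in the flat form:

    class Alternative1Sketch {
        sealed interface Shape permits Rect, Circle {}
        record Rect() implements Shape {}
        record Circle() implements Shape {}

        sealed interface Container<T> permits Box, Bag {}
        record Box<T>(T content) implements Container<T> {}
        record Bag<T>(T content) implements Container<T> {}

        static String describeFlat(Container<Shape> c) {
            return switch (c) {
                case Box(Rect r)   -> "boxed rect";
                case Box(Circle o) -> "boxed circle";
                case Bag(Rect r)   -> "bagged rect";
                case Bag(Circle o) -> "bagged circle";
                // Under alternative 1, the four cases above would not count as exhaustive on
                // Container<Shape> because Rect/Circle are not total on Shape, so an explicit
                // default (or `case Box b` / `case Bag b`) would be required.
                default            -> "unexpected container";
            };
        }
    }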
>> On Thu, Aug 20, 2020 at 10:57 PM Brian Goetz <brian.go...@oracle.com> wrote:
>>> Tagir's question about exhaustiveness in switches points to some technical debt left over from expression switches.
>>>
>>> (Note: this entire discussion has nothing to do with whether `case Object o` is nullable; it has strictly to do with extending the existing treatment of exhaustive switches over enums to sealed classes, when we can conclude a switch over such a type is total without a default / total case, and what implicit cases we have to insert to make up for that. Please let's not conflate this thread with that issue.)
>>>
>>> When we did expression switch, for an exhaustive switch that covered all the enum constants without a default, we inserted an extra catch-all case that throws ICCE, on the theory that nulls are already checked by the switch, so anything that hits the synthetic default must be a novel enum value, which merits an ICCE. This worked for enum switches (where all case labels are discrete values), but doesn't quite scale to sealed types. Let's fix that.
>>>
>>> As a recap, suppose we have
>>>
>>>     enum E { A, B; }
>>>
>>> and suppose that, via separate compilation, a novel value C is introduced that was unknown at the time the switch was compiled.
>>>
>>> An "exhaustive" statement switch on E:
>>>
>>>     switch (e) {
>>>         case A:
>>>         case B:
>>>     }
>>>
>>> throws NPE on null but does nothing on C, because switch statements make no attempt at being exhaustive.
>>>
>>> An _expression_ switch that is deemed exhaustive without a default case:
>>>
>>>     var s = switch (e) {
>>>         case A -> ...
>>>         case B -> ...
>>>     }
>>>
>>> throws NPE on null and ICCE on C.
>>>
>>> At the time, we were concerned about the gap between statement and expression switches, and talked about having a way to make statement switches exhaustive. That's still on the table, and we should still address it, but it's not the subject of this mail.
>>>
>>> What I want to focus on in this mail is the interplay between exhaustiveness analysis and (exhaustive) switch semantics, and what code we have to inject to make up for gaps. We've identified two sources of gaps: nulls, and novel enum values. When we get to sealed types, we can add novel subtypes to the list of things we have to detect and implicitly reject; when we get to deconstruction patterns, we need to address these at nested levels too.
>>>
>>> Let's analyze switches on Container<Shape> assuming:
>>>
>>>     Container<T> = Box<T> | Bag<T>
>>>     Shape = Rect | Circle
>>>
>>> and assume a novel shape Pentagon shows up unexpectedly via separate compilation.
>>>
>>> If we have a switch _statement_ with:
>>>
>>>     case Box(Rect r)
>>>     case Box(Circle c)
>>>     case Bag(Rect r)
>>>     case Bag(Circle c)
>>>
>>> then the only value we implicitly handle is null; everything else just falls out of the switch, because statement switches don't try to be exhaustive.
>>>
>>> If this is an expression switch, then I think it's safe to say:
>>>
>>> - The switch should be deemed exhaustive; no Box(null) etc. cases are needed.
>>> - We get an NPE on null.
>>>
>>> But that leaves Box(null), Bag(null), Box(Pentagon), and Bag(Pentagon). We have to do something with these (the switch has to be total), and again, asking users to handle them manually is unreasonable. A reasonable strawman here is:
>>>
>>> - ICCE on Box(Pentagon) and Bag(Pentagon)
>>> - NPE on Box(null) and Bag(null)
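To see roughly what that strawman would mean at the use site, here is a hand-written approximation (a sketch of the intended semantics, not a claim about the actual translation strategy), reusing the Shape/Container declarations from the sketch after Tagir's mail; putting the null checks first also reflects the point above that an NPE should win over an ICCE for the same pattern:

    // Reuses Shape = Rect | Circle and Container = Box | Bag from the earlier sketch.
    static String describeStrawman(Container<Shape> container) {
        return switch (container) {
            case null                      -> throw new NullPointerException();   // null itself
            case Box(var s) when s == null -> throw new NullPointerException();   // Box(null)
            case Bag(var s) when s == null -> throw new NullPointerException();   // Bag(null)
            case Box(Rect r)               -> "boxed rect";
            case Box(Circle c)             -> "boxed circle";
            case Bag(Rect r)               -> "bagged rect";
            case Bag(Circle c)             -> "bagged circle";
            // Box(Pentagon), Bag(Pentagon), or any other novel subtype ends up here.
            default                        -> throw new IncompatibleClassChangeError();
        };
    }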
>>> Essentially, what this means is: we need to explicitly consider null and novel values/types of enums/sealed classes in our exhaustiveness analysis, and, if these are not seen to be explicitly covered and the implicit coverage plays into the conclusion of overall weak totality, then we need to insert implicit catch-alls for these cases.
>>>
>>> If we switch over:
>>>
>>>     case Box(Rect r)
>>>     case Box(Circle c)
>>>     case Box b
>>>     case Bag(Rect r)
>>>     case Bag(Circle c)
>>>
>>> then Box(Pentagon) and Box(null) are handled by the `Box b` case and don't need to be handled by a catch-all.
>>>
>>> If we have:
>>>
>>>     case Box(Rect r)
>>>     case Box(Circle c)
>>>     case Bag(Rect r)
>>>     case Bag(Circle c)
>>>     default
>>>
>>> then Box(Pentagon|null) and Bag(Pentagon|null) clearly fall into the default case, so no special handling is needed there.
>>>
>>> Are we in agreement on what _should_ happen in these cases? If so, I can put a more formal basis on it.
>>>
>>> On 8/14/2020 1:20 PM, Brian Goetz wrote:
>>>
>>> - Exhaustiveness and null. (Tagir) For sealed domains (enums and sealed types), we kind of cheated with expression switches because we could count on the switch filtering out the null. But Tagir raises an excellent point, which is that we do not yet have a sound definition of exhaustiveness that scales to nested patterns (do Box(Rect) and Box(Circle) cover Box(Shape)?) This is an interaction between sealed types and patterns that needs to be ironed out. (Thanks Tagir!)
>>>
>>> [ Breaking this out from Tagir's more comprehensive reply ]
>>>
>>> It's unclear to me how exhaustiveness on nested patterns plays with null. case Box(Circle c) and case Box(Rect r) don't cover case Box(null), which is a valid possibility for the Box<Shape> type.
>>>
>>> It's not even clear how exhaustiveness plays with null even without nesting, so let's start there. Consider this switch:
>>>
>>>     switch (trafficLight) {
>>>         case GREEN, YELLOW: driveReallyFast();
>>>         case RED: sigh();
>>>     }
>>>
>>> Is it exhaustive? Well, we want to say yes. And with the existing null-hostility of switch, it is. But even without that, we'd like to say yes, because a null enum value is almost always an error, and making users deal with cases that don't happen in reality is kind of rude.
>>>
>>> For a domain sealed to a set of alternatives (enums or sealed classes), let's say that a set of patterns is _weakly exhaustive_ if it covers all the alternatives but not null, and _strongly exhaustive_ if it also covers null.
>>>
>>> When we did switch expressions, we said that weakly exhaustive coverings didn't need a default in a switch expression. I think we're primed to say the same thing for sealed classes. But this "weak is good enough" leans on the fact that the existing hostility of switch will cover what we miss. We get no such cover in nested cases.
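For reference, the enum behavior being leaned on here, written as a switch expression (the enum and method names are made up for the sketch):

    enum TrafficLight { GREEN, YELLOW, RED }

    // Weakly exhaustive: every constant known at compile time is covered, null is not.
    // No default is needed; at run time a null selector throws NPE, and a constant added
    // later via separate compilation hits the synthetic default, which throws ICCE.
    static String react(TrafficLight light) {
        return switch (light) {
            case GREEN, YELLOW -> "drive really fast";
            case RED           -> "sigh";
        };
    }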
>>> I think it's worth examining further why we are willing to accept the weak coverage with enums. Is it really that we're willing to assume that enums just should never be null? If we had type cardinalities in the language, would we treat `enum X` as declaring a cardinality-1 type called X? I think we might. OK, what about sealed classes? Would the same thing carry over? Not so sure there. And this is a problem, because we ultimately want:
>>>
>>>     case Optional.of(var x):
>>>     case Optional.empty():
>>>
>>> to be exhaustive on Optional<T>, and said exhaustiveness will likely lean on some sort of sealing.
>>>
>>> This is related to Guy's observation that totality is a "subtree all the way down" property. Consider:
>>>
>>>     sealed class Container<T> permits Box, Bag { }
>>>     sealed class Shape permits Rect, Circle { }
>>>
>>> Ignoring null, Box+Bag should be exhaustive on Container, and Rect+Circle should be exhaustive on Shape. So if we are switching over a Container<Shape>, then what of:
>>>
>>>     case Box(Rect r):
>>>     case Box(Circle c):
>>>     case Bag(Rect r):
>>>     case Bag(Circle c):
>>>
>>> We have some "nullity holes" in three places: Box(null), Bag(null), and null itself. Is this set of cases exhaustive on Bags, Boxes, or Containers?
>>>
>>> I think users would like to be able to write the above four cases and treat them as exhaustive; having to explicitly provide Box(null) / Box b, Bag(null) / Bag b, or a catch-all to accept null + Box(null) + Bag(null), would all be deemed unpleasant ceremony.
>>>
>>> Hmm...
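Spelling out that ceremony as a sketch (reusing the Shape = Rect | Circle and Container = Box | Bag declarations from the earlier sketches; the method name is a placeholder), a switch that plugs the three holes by hand would look something like:

    static void inspect(Container<Shape> container) {
        switch (container) {
            case Box(Rect r)   -> System.out.println("boxed rect");
            case Box(Circle c) -> System.out.println("boxed circle");
            case Bag(Rect r)   -> System.out.println("bagged rect");
            case Bag(Circle c) -> System.out.println("bagged circle");
            case Box b         -> System.out.println("box holding null or a novel Shape");   // Box(null), Box(Pentagon)
            case Bag b         -> System.out.println("bag holding null or a novel Shape");   // Bag(null), Bag(Pentagon)
            case null          -> System.out.println("no container at all");                 // null itself
        }
    }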