I like this new analysis. See below. > On Aug 26, 2020, at 1:01 PM, Brian Goetz <brian.go...@oracle.com> wrote: > > I think we now have a sound story for totality in both patterns and switches. > Let's start with refining what we mean by totality. > > We have seen a lot of cases -- and not just those involving enums or sealed > types -- where we want to say a pattern or set of patterns is "total enough" > to not force the user to explicitly handle the corner cases. In these cases, > we will let the compiler handle the corner cases by generating exceptions. > > A prime example is the deconstruction pattern Foo(var x); this matches all > Foos, but not null. Similarly, there is a whole family of such corner cases: > Foo(Bar(var x)) matches all Foos, except for null and Foo(null). But we are > in agreement that it would be overly pedantic to force the user to handle > these explicitly. > > Note that these show up not only in switch, but in pattern assignment: > > Object o = e; // Object o is a total pattern on typeof(e), so always > succeeds > Foo(var x) = foo // Total on Foo, except null, so should NPE on null > Foo(Bar(var x)) = foo // Total on Foo, except null and Foo(null), throw > on these > > var y = switch (foo) { > case Foo(var x) -> bar(x); // Total on Foo, except null, so NPE on > null > } > > These goals come from a pragmatic desire to not pedantically require the user > to spell out the residue, but to accept the pattern (or set of patterns) as > total anyway. > > > Now, let's put this intuition on a sounder footing by writing some formal > rules for saying that a pattern P is _total on T with remainder R_, where R > is a _set of patterns_. We will say P is _strongly total on T_ if P is total > on T with empty remainder. > > The intuition is that, if P is total on T with remainder R, the values > matched by R but not by P are "silly" values and a language construct like > switch can (a) consider P sufficient to establish totality and (b) can insert > synthetic tests for each of the patterns in R that throw. > > We will lean on this in _both_ switch and pattern assignment. For switch, we > will treat it as if we insert synthetic cases after P that throw, so that the > remaining values can still be matched by earlier explicit patterns. > > Invariant: If P is total on T with remainder R then, for all t in T, either t > matches P, or t matches some pattern in R. (This is not the definition of > what makes a pattern total; it is just something that is true about total > patterns.) > > Base cases: > > - The type pattern `T t` is strongly total on U <: T. > - The inferred type pattern `var x` is strongly total on all T. > > Induction cases: > > - Let D(T) be a deconstructor. > - The deconstruction pattern D(Q), where Q is strongly total on T, is > total on D with remainder { null }. > - The deconstruction pattern D(Q), where Q is total on T with remainder > R*, is total on D with remainder { null } union { D(R) : R in R* }
Note that the first sub-bullet is “merely” an important special case of the second sub-subbullet. > We can easily generalize the definition of totality to a set of patterns. In > this case, we can handle sealed types and enums: > > - Let E be an enum type. The set of patterns { C1 .. Cn } is total on E > with remainder { null, E e } if C1 .. Cn contains all the constants of E. > > Observation: that `E e` pattern is pretty broad! But that's OK; it captures > all novel constant values, and, because the explicit cases cover all the > known values, captures only the novel values. Same for sealed types: > > - Let S be a sealed abstract type or interface, with permitted direct > subtypes C*, and P* be a set of patterns applicable to S. If for each C in > C*, there exists a subset Pc of P* that is total on C with remainder Rc, then > P* is total on S with remainder { null } union \forall{c \in C}{ Rc }. I think there should not be set braces around that last occurrence of “Rc”. > - Let D(T) be a deconstructor, and let P* be total on T with remainder R*. > Then { D(P) : P in P* } is total on D with remainder { null } \union { D(R) : > R in R* }. > > Example: > Container<T> = Box<T> | Bag<T> > Shape = Circle | Rect > P* = { Box(Circle), Box(Rect), Bag(Circle), Bag(Rect) } I assume that this use of syntax “T = U | V” is meant to imply that T is a sealed type that permits U and V. > { Circle, Rect } total on Shape with remainder { Shape, null } > > -> { Box(Circle), Box(Rect) total on Box<Shape> with remainder { Box(Shape), > Box(null), null } > > -> { Bag(Circle), Bag(Rect) total on Bag<Shape> with remainder { Bag(Shape), > Bag(null), null } > > -> P* total on Container<Shape> with remainder { Container(Box(Shape)), > Container(Box(null)), Container(Bag(Shape)), Container(Box(Shape)), > Container(null), null } I believe that, in this last remainder, the second occurrence of `Container(Box(Shape))` was intended to be `Container(Bag(null))`. But I also think that the phrase `Container(Box(Shape))` is inconsistent or incoherent; there has been some confusion of the _type parameter_ of `Container` with a _deconstruction parameter_ of `Container`. To say `Container(Box(…))` is as silly as to say `Shape(Rect)`. I will try to redo this derivation while being very explicit about the type parameters: Container<T> = Box<T> | Bag<T> Shape = Circle | Rect P* = { Box<Shape>(Circle), Box<Shape>(Rect), Bag<Shape>(Circle), Bag<Shape>(Rect) } { Box<T>(T), Bag<T>(T) } total on Container<T> with remainder { Container<T>, null } Now instantiate the previous rule with T=Shape to get { Box<Shape>(Shape), Bag<Shape>(Shape) } total on Container<Shape>(Shape) with remainder { Container<Shape>(Shape), null } { Circle, Rect } total on Shape with remainder { Shape, null } -> { Box<Shape>(Circle), Box<Shape>(Rect) total on Box<Shape> with remainder { Box<Shape>(Shape), Box<Shape>(null), null } -> { Bag<Shape>(Circle), Bag<Shape>(Rect) total on Bag<Shape> with remainder { Bag<Shape>(Shape), Bag<Shape>(null), null } -> P* total on Container<Shape> with remainder { Box<Shape>(Shape), Box<shape>(null), Bag<Shape>(Shape), Bag<Shape>(null), Container<Shape>, null } > Now: > > - A pattern assignment `P = e` requires that P be total on the static type > of `e`, with some remainder R, and throws on the remainder. > - A total switch on `e` requires that its cases be total on the static type > of `e`, with some remainder R, and inserts synthetic throwing cases at the > end of the switch for each pattern in R. Yes! To which we can perhaps add: a pattern `instanceof` expression `x instanceof P` requires that the pattern P _not_ be total on the type of x. > We can then decide how to opt into totality in switches, other than "be an > expression switch.” Yes!