Re: Exploring inference for sealed types

Brian Goetz Tue, 24 Sep 2019 11:35:15 -0700

Thanks Gavin. We had an internal discussion on this today, which I will summarize here to help illuminate the issue.

As Brian mentioned in an earlier email, sealed types address two related, but
distinct, issues: (1) declaring a sum type, whereby the compiler can exploit
exhaustiveness in various places (e.g. in a switch); and (2) defining a type
that for clients behaves as if it is final (it cannot be extended), but for
the class author actually has a fixed, known collection of implementations.

It would only be a moderate exaggeration to say that this is really two features -- which brings us to the classic lump/split decision -- should we expose this as one feature, or two? (In the past, Alan M proposed a splitting where the sum-hierarchy case looked more like an enum declaration, for example.) We have been leaning to "one", but there is a risk that attempting to cover both cases makes each of them harder to understand.

The second use dictates a subtle design constraint: a type that directly
extends/implements a `sealed` type must be either `sealed`, `final` or
`non-sealed`. If not, it will be too easy to create a security hole where the
class author intended a class hierarchy to be closed, but by forgetting a
modifier at a leaf type, inadvertently renders the hierarchy open.

This is our line in the sand; it would not be OK to have an arbitrary subtype `class X implements I { }` for some sealed I, and have X end up being open for extension.

One valid design point is to stop here. All `sealed`/`non-sealed`/`final`
modifiers and `permits` clauses have to be given explicitly. The compiler then
just checks that what has been declared is correct.

Indeed, we could stop here; let's call this our baseline. If we stopped here, we'd get the desired safety benefits, explicitness, and a reasonable lumping. Let's be more explicit about why we might do more.
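
To make the baseline concrete, here is a minimal sketch written in the sealed-types syntax that later shipped in Java 17. That syntax post-dates this message, so its exact spelling is an assumption relative to the design under discussion. The names `Shape`, `Circle`, `Polygon`, `Square`, `Freeform`, and the helper `kind` are hypothetical. Every modifier and permits clause is written by hand; reflection only confirms what was declared.

```java
import java.lang.reflect.Modifier;

public class BaselineDemo {
    // Baseline: every modifier and every permits clause is written out by hand.
    sealed interface Shape permits Circle, Polygon, Freeform {}

    // Leaf: closed to further extension.
    static final class Circle implements Shape {}

    // Intermediate node: sealed again, with its own explicit permits list.
    static abstract sealed class Polygon implements Shape permits Square {}
    static final class Square extends Polygon {}

    // Deliberately re-opened: any class may extend Freeform.
    static non-sealed class Freeform implements Shape {}

    // Classifies a type as "sealed", "final", or "open" via reflection.
    static String kind(Class<?> c) {
        if (c.isSealed()) return "sealed";
        if (Modifier.isFinal(c.getModifiers())) return "final";
        return "open";
    }

    public static void main(String[] args) {
        Class<?>[] types = {Shape.class, Circle.class, Polygon.class, Square.class, Freeform.class};
        for (Class<?> c : types) {
            System.out.println(c.getSimpleName() + ": " + kind(c));
        }
    }
}
```

Dropping the modifier on any direct subtype here (for example, writing plain `static class Circle implements Shape {}`) is rejected by the compiler, which is the "line in the sand" behavior described above.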

If we declare a sum hierarchy, there are three unfortunate bits of O(n) repetition:


    sealed interface X permits A, B, C {
        final class A implements X {}
        final class B implements X {}
        final class C implements X {}
    }

These three bits are: listing the subtypes twice (once at the declaration, once at the permits clause); saying "implements X" repeatedly; and saying "final" repeatedly. (In the event the subtypes are records, the last is automatically taken care of.)
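
As a small illustration of the parenthetical about records, the sketch below restates the X/A/B/C hierarchy with record subtypes, again assuming the syntax that later shipped in Java 17. The record components and the helper `allLeavesFinal` are hypothetical. No `final` keyword appears, yet each leaf reports itself as final because record classes are implicitly final.

```java
import java.lang.reflect.Modifier;

public class RecordLeavesDemo {
    // Same shape as the X/A/B/C example, but with records as the subtypes.
    sealed interface X permits A, B, C {}

    // No "final" keyword needed: a record class is implicitly final.
    record A() implements X {}
    record B(int value) implements X {}
    record C(String name) implements X {}

    // Returns true if every permitted subtype of X carries the final modifier.
    static boolean allLeavesFinal() {
        for (Class<?> leaf : X.class.getPermittedSubclasses()) {
            if (!Modifier.isFinal(leaf.getModifiers())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all leaves final: " + allLeavesFinal());
    }
}
```

The other two repetitions (the permits list and the repeated `implements X`) are still present in this form.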

We have been exploring some alternative design points, all supporting some
sort of inference.

The inference scheme Gavin proposed addresses the first and last of these, at some cost (both to perceived complexity and implementation.) Let's drill into why we proposed this in the first place.

The case I have in mind -- which I believe will be quite common -- is a flat hierarchy (one sealed supertype, N direct subtypes) with a relatively high degree of fan-out. (This shows up in all sorts of document tree representations.) And of the three repetitions, my claim is that the most irritating is the permits clause. Imagine the above hierarchy fanned out to A-Z; there's a 26-way permits clause that is both annoying to write and not particularly enlightening to read (and as a bonus, error-prone to update.) If we're going to infer anything, we should start here.

So one sensible increment atop the baseline is: (A) if a top-level type is explicitly declared sealed, and has no permits clause, we can infer the permits clause from the subtypes in the compilation unit (or more narrowly, the _nested_ subtypes). This is simpler than the full scheme outlined, while addressing the biggest case that concerns me -- the high-fanout case.

The above scheme could be incrementally extended to (B) any class explicitly marked `sealed` -- if you say sealed and leave off `permits`, the permits list is inferred from the current compilation unit. This seems defensible.
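
A sketch of what (A) and (B) look like in practice, using the sealed-types syntax that later shipped in Java 17, where a sealed type without a `permits` clause gets its permitted subtypes from the same compilation unit. The document-tree names (`Node`, `Text`, `Image`, `Block`, `Paragraph`, `Heading`) and the helper `permitted` are hypothetical, and everything is nested in one holder class only to keep the example to a single file.

```java
import java.util.TreeSet;

public class InferredPermitsDemo {
    // (A): declared sealed with no permits clause; the subtypes are nested inside.
    sealed interface Node {
        record Text(String content) implements Node {}
        record Image(String url) implements Node {}

        // (B): a sealed type that is not the root, also with no permits clause.
        sealed interface Block extends Node {}
    }

    // Subtypes declared elsewhere in the same compilation unit are picked up too.
    record Paragraph() implements Node.Block {}
    record Heading(int level) implements Node.Block {}

    // Returns the simple names of the permitted subtypes, sorted and comma-joined.
    static String permitted(Class<?> sealedType) {
        TreeSet<String> names = new TreeSet<>();
        for (Class<?> c : sealedType.getPermittedSubclasses()) {
            names.add(c.getSimpleName());
        }
        return String.join(",", names);
    }

    public static void main(String[] args) {
        System.out.println("Node permits:  " + permitted(Node.class));
        System.out.println("Block permits: " + permitted(Node.Block.class));
    }
}
```

Adding a 27th subtype to such a hierarchy means writing one new declaration, with no separate permits list to keep in sync. Note that the subtypes still state their own `implements` clause and their own finality (here via records), so only the first of the three repetitions is removed.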

Where we end up with confusing action-at-a-distance is to infer finality/sealed-ness for subtypes. We could back off from this completely, or we could take a simpler projection, which is (C) to say any direct subtype of a sealed type is implicitly final, unless it explicitly says `sealed` or `non-sealed`.

So, if we're lumping, we could choose "baseline", or "baseline + A", or "baseline + A + B", or "baseline + A + B + C". All seem defensible, though baseline-only seems likely to provide some ongoing irritation for the permits clause.

If we go the split direction, we have more choices, but having chased a few of these down, they all seem to arrive at muddy places. Alan's `enum class` approach looks clean when the components of the sum are simple records, but when they are more complex classes, the expression starts to get ugly.

Another direction is to borrow the terminology (but not the semantics) from `case` classes in Scala, where we explicitly mark the subtypes, which has the effect of turning on all the complex inference, but at least making it less magic:


    sealed interface I {
        case class A { ... }
        case class B { ... }
        case class C { ... }
    }

where we'd infer the proper sealed-ness/finality, permits clauses, and implements clauses. But, I think when we get beyond toy examples, this approach feels unlikely to offer sufficient benefit to justify itself, and the toy examples work reasonably well already (since records are already final.)

So my suggestion is to start with Baseline + (A | A&B), limiting inference to permits clauses, and see if that is enough.
