> De: "Brian Goetz" <brian.go...@oracle.com> > À: "Alan Malloy" <amal...@google.com> > Cc: "amber-spec-experts" <amber-spec-experts@openjdk.java.net> > Envoyé: Lundi 29 Avril 2019 23:01:17 > Objet: Re: Feedback on Sealed Types
> It would be nice if we could "just" overload enum itself to support a > record-like option: > enum Node { > AddNode(Node a, Node b), > MulNode(Node a, Node b), > ...; > ... > } > but unfortunately that syntax looks confusingly close to something else :( It > also doesn't really scale to multi-level hierarchies, though that might be OK. > Its worth thinking about, though. The `data` construct from Haskell is surely > more direct than modeling a sum with sealed interfaces (though, the latter is > also more flexible than `data`.) > On the other hand, declaring a sum of records as: > sealed interface I { > record A(int a) implements I { } > record B(long b) implements I { } > record C(String c) implements I { } > } > isn't so bad. The main redundancy here is primarily "implements I", and > secondarily "record". We can surely compress that away like this: > enum interface I { > A(int a), > B(long b), > C(String c); > // I methods > } > but I am not sure it carries its weight, given as it adds little additional > concision (and no additional semantics.) It may solve the enclosing issue because the ';' syntactically separate A, B and C from the content of I which is declared after the ';', so A, B and C can be top-level. I kind a like the intellectual separation between - a sealed interface which represent a closed type and requires a permit clause and - an enum interface which represent a sum type which is sugar on top of sealed interface + records. one interesting question is how to desugar an enum interface with a component that has no parameter, like enum interface Option<T> { Some(T value), Empty } If there is only one constant of type Empty and the construction is typesafe, it can be a huge win. Rémi > On 4/29/2019 4:33 PM, Alan Malloy wrote: >> Thanks, Brian. I indeed didn't think of some of your proposed benefits of >> sealing non-sum types, as I was focused mostly on things you mentioned >> explicitly in the JEP, which is somewhat light on the expected benefits. >> I think the first two items in your "challenges" solve each other: I don't >> intend sum types to be the only kind of sealed type, but just a good way to >> declare the simplest kind. I left out the "record" keyword from the >> declaration >> with the idea that it would be implicit: if you want the convenient >> sum-of-products declaration style, you have to use records. If you want >> something more complicated, you declare a sealed interface (or superclass), >> and >> N permitted subclasses, declared separately in whatever way you want. This >> restriction helps by making the semantics clearer, and I had also hoped that >> it >> would lead to a syntax error if you leave out the comma. Looking more >> closely, >> I see this is somewhat precarious: a record declaration looks enough like a >> method signature that they may be ambiguous in an interface, if you don't >> require the "record" keyword, or if you use a semicolon instead of a comma. I >> think it can still work if we require each nested record to use {...} instead >> of ; even if it's empty. This way, your two examples look like >> interface X { >> class X1 { … } >> class X2 { … } >> } >> and >> enumerated interface Y { >> Y1 { … }, <— don’t forget this comma! >> Y2 { … } >> } >> The latter would become illegal if you dropped the comma, even if you also >> forgot the "enumerated" keyword, because the braces make no sense in an >> ordinary interface. >> On Mon, Apr 29, 2019 at 12:35 PM Brian Goetz < [ >> mailto:brian.go...@oracle.com | >> brian.go...@oracle.com ] > wrote: >>> Thanks Alan, for this nice exploration. There’s a lot to respond to. I’ll >>> start >>> with some general comments about sealing, and then move on to your alternate >>> proposal for exposing it. >>> I can think of several main reasons why you would want to seal a hierarchy. >>> - To say something about the _interface itself_. That is, that it was not >>> designed as a general purpose contract between arms-length entities, but >>> that >>> it exists only as a common super type for a fixed set of classes that are >>> part >>> of the library. In other words, “please don’t implement me.” >>> - To say something about the semantics of the type. Several of the examples >>> in >>> your report fall into this category: “a DbResult is either a NoRowsFound or >>> a >>> Rows(List<Row>)”. This tells users exactly what the reasonable range of >>> results >>> to expect are when doing a query. Of course, the spec could say the same >>> thing, >>> but that involves reading and interpreting the spec. Easier if this >>> conclusion >>> can be driven by types (and IDEs can help more here too.) >>> - To strengthen typing by simulating unions. If my method is going to return >>> either a String or a Number, the common super type is Object. (Actually, >>> it’s >>> some variant of Serializable & Comparable<? extends….>.). Sums-of-products >>> allow library authors to make a stronger statement about types in the >>> presence >>> of unions. Exposing a sum of StringHolder(String) and NumberHolder(Number), >>> using records and sealed types, is not so ceremonious, so some library >>> developers might choose to do this instead of Object. >>> - Security. Some libraries may want to know that the code they are calling >>> is >>> part of their library, rather than an arbitrary implementation of some >>> interface. >>> - To aid in exhaustiveness. We’ve already discussed this at length; your >>> point >>> is that this one doesn’t come up as often as one might hope. >>> Not only is there an obvious synergy between sums and products (as many >>> languages have demonstrated), but there is a third factor, which is “if you >>> make it easy enough, people will use it more.” Clearly records are pretty >>> easy >>> to use; your point is that if there were a more streamlined sum-of-products >>> idiom, the third factor would be even stronger here. I think algebraic data >>> types is one of those things that will take some time for developers to >>> learn >>> to appreciate; the easier we make it, of course the faster that will happen. >>> Now, to your syntax suggestion. Overall, I like the idea, but I have some >>> concerns. First, the good parts: >>> - The connection with enums is powerful. Users already understand enums, so >>> this >>> will help them understand sums. Enums have special treatment in switch; we >>> want >>> the same treatment for sealed type patterns. Enums have special treatment >>> for >>> exhaustiveness; we want the same for sealed type patterns. So tying these >>> together with some more general enum-ness leans on what people already know. >>> - While sums and products are theoretically independent features, >>> sums-of-products are expected to be quite common. So it might be reasonable >>> to >>> encourage this syntactically. >>> - The current proposal has some redundancy, in that the subtypes have to say >>> “implements Node”, even if they are nested within Node. With a stronger >>> mechanism for declaring them, as you propose, then that can safely be left >>> implicit. >>> - I confess that I too like the simplicity of Haskell’s `data` declaration, >>> and >>> this brings us closer. >>> Now, the challenges: >>> - The result is still a little busy. We need a modifier for “enumerated >>> type”, >>> and we would also need to be able to have child types be not only records, >>> but >>> ordinary classes and interfaces. So we’d have to have a place for “record”, >>> “class”, or “interface” with the declaration of the enumerated classes (as >>> well >>> as other modifiers.). That busies up the result a bit. >>> - Once we do this, I worry that it will be hard to tell the difference >>> between: >>> interface X { >>> class X1 { … } >>> class X2 { … } >>> } >>> and >>> enumerated interface Y { >>> class Y1 { … }, <— don’t forget this comma! >>> class Y2 { … } >>> } >>> and that users will forever be making mistakes like forgetting the comma, or >>> putting it where it doesn’t belong. >>> - This mechanism addresses the very common case of sum-of-product, but >>> leaves >>> more esoteric sums out of the picture. (Consider the types in >>> java.lang.constant, which really want to be sealed.). There, because they >>> are >>> not co-declared, we’d need something more like >>> sealed interface ConstantDesc >>> permits ClassDesc, MethodTypeDesc, …. { } >>> It's possible that such a mechanism can be grafted on to your proposal, or >>> there >>> is a shuffling that supports it. >>>> On Apr 29, 2019, at 2:28 PM, Alan Malloy < [ mailto:amal...@google.com | >>>> amal...@google.com ] > wrote: >>>> Hello again, amber-spec-experts. I have another report from the Google >>>> codebase, >>>> this time focusing on sealed types. It is viewable in full Technicolor >>>> HTML at >>>> [ http://cr.openjdk.java.net/~cushon/amalloy/sealed-types-report.html | >>>> http://cr.openjdk.java.net/~cushon/amalloy/sealed-types-report.html ] >>>> (thanks >>>> again to Liam for hosting), and included below as plain text: >>>> Author: Alan Malloy ( [ mailto:amal...@google.com | amal...@google.com ] ) >>>> Published: 2019-04-29 >>>> Feedback on Sealed Types >>>> Hello again, amber-spec-experts. I’m back with a second Google codebase >>>> research >>>> project. I’m looking again at the Records & Sealed Types proposal (which >>>> has >>>> now become JDK-8222777), but this time I’m focusing on sealed types >>>> instead of >>>> records, as promised in my RFC of a few weeks ago. My goal was to >>>> investigate >>>> Google’s codebase to guess what developers might have done differently if >>>> they >>>> had access to sealed types. This could help us understand what works in the >>>> current proposal and what to consider changing. >>>> Unlike my previous report, this one contains more anecdotes than >>>> statistics. It >>>> wound up being difficult to build static analysis to categorize the >>>> interesting >>>> cases, so I mostly examined promising candidates by hand. >>>> Summary and Recommendations >>>> For those who don’t care to read through all my anecdotes, I first provide >>>> a >>>> summary of my findings, and one suggested addition. >>>> Sealed types, as proposed so far, are a good idea in theory: Java already >>>> has >>>> product types and open polymorphism, and sealed types give us closed >>>> polymorphism. However, I could not find many cases of code being written >>>> today >>>> that would be greatly enhanced if sealed types were available. The main >>>> selling >>>> point of sealed types for application authors is getting help from the >>>> compiler >>>> with exhaustiveness checking, but in practice developers almost always >>>> have a >>>> default case, because they are only interested in a subset of the possible >>>> subclasses, and want to ignore cases they don’t understand. This means that >>>> exhaustiveness-checking for pattern matches would mostly go unused if >>>> developers rewrote their existing code using sealed types. >>>> Pattern matching is great, and can replace visitors in many cases, but >>>> this does >>>> not depend on sealed types except for exhaustiveness checks (which, again, >>>> would go mostly unused in code written today). The class hierarchies for >>>> which >>>> people define visitors today are just too large to write an exhaustive >>>> pattern >>>> match, and so a default case would be very common. >>>> The other audience for sealed types is library authors. While in practice >>>> most >>>> developers have no great need to forbid subclasses, perhaps it would be a >>>> boon >>>> for authors of particularly popular libraries, who need to expose a >>>> non-final >>>> class as an implementation detail but don’t intend for consumers to create >>>> their own subclasses. Those authors can already include documentation >>>> saying >>>> “you really should not extend this”, but there is always some weirdo out >>>> there >>>> who will ignore your warnings and then write an angry letter when the next >>>> version of your library breaks his program (see: sun.misc.Unsafe). Authors >>>> of >>>> such libraries would welcome the opportunity to make it truly impossible to >>>> create undesirable subclasses. >>>> Sealed Types As a Vehicle For Sum Types >>>> So, sealed types as-is would be an improvement, but a niche one, used by >>>> few. I >>>> think we can get substantially more mileage out of them if we also include >>>> a >>>> more cohesive way to explicitly define a sum type and all its subtypes in >>>> one >>>> place with minimal ceremony. Such a sum type could be sealed, implicitly or >>>> explicitly. A tool like this takes what I see as the “theoretical” >>>> advantage of >>>> sum types (closed polymorphism), and makes it “practical” by putting it >>>> front >>>> and center. Making sums an actual language element instead of something >>>> “implied” by sealing a type and putting its subclasses nearby could help >>>> in a >>>> lot of ways: >>>> * Developers might more often realize that a sealed/sum type is a good >>>> model for >>>> their domain. Currently it’s a “pattern” external to the language instead >>>> of a >>>> “feature”, and many don’t realize it could be applied to their domain. >>>> Putting >>>> it in the language raises its profile, addressing the problem that people >>>> don’t >>>> realize they want it. >>>> * The compiler could provide help for defining simple sums-of-products, >>>> while >>>> making it possible to opt into more complicated subclasses, in much the way >>>> that enums do: the typical enum just has bare constants like EAST, but you >>>> can >>>> add constructor arguments or override methods when necessary. >>>> * The ability to more easily model data in this way may result in >>>> developers >>>> writing more classes that are amenable to sealing/sums, as they do in other >>>> languages with explicit sum types (Haskell, Kotlin, Scala). Then, the >>>> exhaustiveness-checking feature that sealed types provide would pull more >>>> weight. >>>> Since enum types are “degenerate sum types”, the syntax for defining sums >>>> can >>>> borrow heavily from enums. A sketch of the syntax I imagine for such >>>> things (of >>>> course, I am not married to it): >>>> public type-enum interface BinaryTree<T> { >>>> Leaf { >>>> @Override public Stream<T> elements() {return Stream.empty();} >>>> }, >>>> Node(T data, BinaryTree<T> left, BinaryTree<T> right) { >>>> @Override public Stream<T> elements() { >>>> return Stream.concat(left.elements(), >>>> Stream.concat(Stream.of(data), right.elements())); >>>> } >>>> }; >>>> public Stream<T> elements(); >>>> } >>>> Like enums, you can use a bare identifier for simple types that exist only >>>> to be >>>> pattern-matched against, but you can add fields and/or override blocks as >>>> necessary. The advantage over declaring a sealed type separately from its >>>> elements is both concision (the compiler infers visible records, >>>> superclass, >>>> and all type parameters) and clarity: you state your intention firmly. I >>>> think >>>> a convenient syntax like this will encourage developers to use the powerful >>>> tool of sealed types to model their data. >>>> Evidence in Google’s Codebase >>>> If you are just interested in recommendations, you can stop reading now: >>>> they >>>> are all included in the summary. What follows is a number of anecdotes, or >>>> case >>>> studies if you prefer, that led me to the conclusions above. Each shows a >>>> type >>>> that might have been written as a sealed type, and hopefully highlights a >>>> different facet of the question of how sealed types can be useful. >>>> The first thing I looked for was classes which are often involved in >>>> instanceof >>>> checks. As language authors, we imagine people writing stuff like this[1] >>>> all >>>> the time: >>>> interface Expr {int eval(Scope s);} >>>> record Var(String name) implements Expr { >>>> public int eval(Scope s) {return s.get(name);} >>>> } >>>> record Sum(Expr left, Expr right) implements Expr { >>>> public int eval(Scope s) {return left.eval(s) + right.eval(s);} >>>> } >>>> class Analyzer { >>>> Stream<String> variablesUsed(Expr e) { >>>> if (e instanceof Var) return Stream.of(((Var)e).name); >>>> if (e instanceof Sum) { >>>> return variablesUsed(((Sum)e).left) >>>> .concat(variablesUsed(((Sum)e).right)); >>>> } >>>> throw new IllegalArgumentException(); >>>> } >>>> } >>>> Here, the Expr interface captures some of the functionality shared by all >>>> expressions, but later a client (Analyzer) came along and invented some >>>> other >>>> polymorphic operations to perform on an Expr, which Expr did not support. >>>> So >>>> Analyzer needed to do instanceof checks instead, externalizing the >>>> polymorphism. The principled approach would have been for Expr to export a >>>> visitor to begin with, but perhaps it wasn’t seen as worth the trouble at >>>> the >>>> time. >>>> To try to find this pattern in the wild, I searched for method bodies which >>>> perform multiple instanceof checks against the same variable. Notably, this >>>> excludes the typical equals(Object) method, which only performs a single >>>> check. >>>> For each such variable, I noted: >>>> 1. Its declared type >>>> 2. The set of subtypes it was checked for with instanceof >>>> 3. The common supertype of those subtypes. >>>> I guessed that (3) would usually be the same as (1), but in practice 55% >>>> of the >>>> time they were different. Often, the declared type was Object, or some >>>> generic >>>> type variable which erases to Object, while the common supertype being >>>> tested >>>> was something like Number, Event, or Node. For example, a Container<T> >>>> knows it >>>> will be used in some context where NaN is unsuitable, so it checks whether >>>> its >>>> contents are Float or Double, and if so ensures NaN is not stored. As a >>>> second >>>> example, a serialize(Object) method checks whether its input is String or >>>> ByteString, and throws an exception otherwise. >>>> Bad sealed types found looking at instanceof checks >>>> I looked through the most popular declared types of these candidates, to >>>> investigate which types are often involved in such checks. Most of them >>>> are not >>>> good candidates for a sealed type. Object was the most common type, >>>> followed by >>>> Exception and Throwabe. >>>> Next up is an internal DOMObject class, which sounds promising until I >>>> tell you >>>> it has thousands of direct subclasses. Nobody is doing exhaustive switches >>>> on >>>> this, of course. Instead, many uses iterate over a Collection<DOMObject>, >>>> or >>>> receive a DOMObject in some way, and just check whether it is of one or two >>>> specific subtypes they care about. This turned out to be a very common >>>> pattern, >>>> not just for DOMObject, but for many candidate sealed types I found: nobody >>>> does exhaustive case analysis. They just look for objects they understand >>>> in >>>> some much larger hierarchy, and ignore the rest. >>>> Some more humorous types that are often involved in instanceof checks: >>>> java.net.InetAddress (everyone wants to know if it’s v4 or v6) and >>>> com.sun.source.tree.Tree, in our static-analysis tools. Tree is an >>>> interesting >>>> case: here we do exactly what I mentioned previously for DOMObject. On the >>>> surface it seems that Tree would be a good candidate for a sealed interface >>>> with record subtypes, but in practice I’m not sure what sealing would buy >>>> us. >>>> We would effectively opt out of exhaustiveness-checking by having a large >>>> default case, or by extending a visitor with default-empty methods. Of >>>> course, >>>> sometimes we define a new visitor to do some polymorphic operation over a >>>> Tree, >>>> but more often we just look for one or two subtypes we care about. For >>>> example, >>>> DataFlow inspects a Tree, but knows from context that it is either a >>>> LambdaExpressionTree, MethodTree, or an initializer. >>>> Plausible sealed types found looking at instanceof checks >>>> The previous section notwithstanding, I did dig deep enough into the >>>> results to >>>> find a few classes that could make good sealed types. The most prominent, >>>> and >>>> most interesting, was another AST. There is an abstract Node class for >>>> representing HTML documents. It has just 4 subclasses defined in the same >>>> file: >>>> Text, Comment, Tag, and EndTag. This spartan definition suggests it’s used >>>> for >>>> something like SAX parsing, but I didn’t confirm this. It does everything >>>> you >>>> could hope for from a type like this: it exposes a Visitor, it provides an >>>> accept(Visitor) method, and the superclass specifies abstract methods for a >>>> couple of the most common things you would want to do, such as a String >>>> toHtml() method. >>>> However, recall that I found this class by looking for classes often >>>> involved in >>>> instanceof checks! Some people use the visitor, but why doesn’t everyone? >>>> The >>>> first reason I found is one I’ve mentioned several times already: clients >>>> only >>>> care about one of the 4 cases, and may have felt creating an anonymous >>>> visitor >>>> is too much ceremony. Would they be happy with a switch and a default >>>> clause? >>>> Probably, but it’s hard to know for sure. The second reason surprised me a >>>> bit: >>>> I found clients doing analysis that isn’t really amenable to any one >>>> visitor, >>>> or a simple pattern-match. They’ve written this: >>>> if (mode1) { if (x instanceof Tag) {...} } >>>> else if (mode2) { if (x instanceof Text) {...}} >>>> The same use site cares about different subclasses at different times, >>>> depending >>>> on some other flag(s) controlling its behavior. Even if we offered a >>>> pattern-match on x, it’s difficult to encode the flags correctly. They >>>> would >>>> have to match on a tuple of (mode1, mode2, x), with a case for (true, _, >>>> Tag >>>> name) and another for (false, true, Text text). Technically possible, but >>>> not >>>> really prettier than what they already have, especially since you would >>>> need to >>>> use a local record instead of an anonymous tuple. >>>> Even so, I think this would have benefited from being a sealed type. >>>> Recall that >>>> earlier I carefully said “4 subclasses defined in the same file”. This is >>>> because some jokester in a different package altogether has defined their >>>> own >>>> fifth subclass, Doctype. They have their own sub-interface of Visitor that >>>> knows about Doctype nodes. I can’t help but feel that the authors of Node >>>> would >>>> have preferred to make this illegal, if they had been able to. >>>> The second good sealed type I found is almost an enum, except that one of >>>> the >>>> instances has per-instance data. This is not exactly a surprise, since an >>>> enum >>>> is a degenerate sum type, and one way to think of sealed types is as a way >>>> to >>>> model sums. It looks something like this[2]: >>>> public abstract class DbResult<T> { >>>> public record NoDatabase<T>() extends DbResult<T>; >>>> public record RowNotFound<T>() extends DbResult<T>; >>>> // Four more error types ... >>>> public record EmptySuccess<T>() extends DbResult<T>; >>>> public record SuccessWithData<T>(T data) extends DbResult<T>; >>>> public T getData() { >>>> if (!(this instanceof SuccessWithData)) >>>> throw new DbException(); >>>> return ((SuccessWithData<T>)this).data; >>>> } >>>> public <U> DbResult<U> transform(Function<T, U> f) { >>>> if (!(this instanceof SuccessWithData)) { >>>> return (DbResult<U>)this; >>>> } >>>> return new SuccessWithData<U>(f.apply( >>>> ((SuccessWithData<T>)this).data)); >>>> } >>>> Reading this code made me yearn for Haskell: here is someone who surely >>>> wanted >>>> to write >>>> data DbResult t = NoDatabase | NoRow | EmptySuccess | Success t >>>> but had to spend 120 lines defining their sum-of-products (the extra >>>> verbosity >>>> is because really they made the subclasses private, and defined private >>>> static >>>> singletons for each of the error types, with a static getter to get the >>>> type >>>> parameter right). This seems like a potential win for records and for >>>> sealed >>>> types. Certainly my snippet was much shorter than the actual source file >>>> because the proposed record syntax is quite concise, so that is a real >>>> win. But >>>> what do we really gain from sealing this type? Still nobody does exhaustive >>>> analysis even of this relatively small type: they just use functions like >>>> getData and transform to work with the result generically, or spot-check a >>>> couple interesting subtypes with instanceof. Forbidding subclassing from >>>> other >>>> packages hardly matters: nobody was subclassing it anyway, and nor would >>>> they >>>> be tempted to. Really the improvements DbResult benefits most from are >>>> records, >>>> and pattern-matching on records. It would be much nicer to replace the >>>> instanceof/cast pattern with a pattern-match that extracts the relevant >>>> field. >>>> This is the use case that inspired my idea of a type-enum, in the Summary >>>> section above. Rewriting it as a type-enum eliminates many of the >>>> problems: all >>>> the instanceof checks are gone, we don’t need a bunch of extra keywords for >>>> each case, and we’re explicit about the subclasses “belonging to” the >>>> sealed >>>> parent, which means we get stuff like extends and <T> for free. We get >>>> improved >>>> clarity by letting the definition of the class hierarchy reflect its >>>> “nature” >>>> as a sum. >>>> public abstract type-enum DbResult<T> { >>>> NoDatabase, >>>> RowNotFound, >>>> EmptySuccess, >>>> SuccessWithData(T data) { >>>> @Override public T getData() { >>>> return data; >>>> } >>>> @Override public <U> DbResult<U> transform(Function<T, U> f) { >>>> return new SuccessWithData<U>(f.apply(data)); >>>> } >>>> } >>>> public T getData() { >>>> throw new DbException(); >>>> } >>>> public <U> DbResult<U> transform(Function<T, U> f) { >>>> return (DbResult<U>)this; >>>> } >>>> } >>>> Visitors >>>> Instead of doing a bunch of instanceof checks, the “sophisticated” way to >>>> interact with a class having a small, known set of subtypes is with a >>>> visitor. >>>> I considered doing some complicated analysis to characterize what makes a >>>> class >>>> a visitor, and trying to automatically cross-reference visitors to the >>>> classes >>>> they visit...but in practice simply looking for classes with “Visitor” in >>>> their >>>> name was a strong enough signal that a more complicated approach was not >>>> needed. Having identified visitors, I looked at those visitors with the >>>> most >>>> subclasses, since each distinct subclass corresponds to one “interaction” >>>> with >>>> the sealed type that it visits, and well-used visitors suggest both >>>> popularity >>>> and good design. >>>> One common theme I found: developers aren’t good at applying the visitor >>>> pattern. Many cases I found had some weird and inexplicable quirk compared >>>> to >>>> the “standard” visitor. These developers will be relieved to get >>>> pattern-matching syntax so they can stop writing visitors. >>>> The Visiting Object >>>> The first popular visitor I found was a bit odd to me. It’s another tree >>>> type, >>>> but with a weird amalgam of several visitors, and an unusual approach to >>>> its >>>> double dispatch. I have to include a relatively lengthy code snippet to >>>> show >>>> all of its facets: >>>> public static abstract class Node { >>>> public interface Visitor { >>>> boolean process(Node node); >>>> } >>>> public boolean visit(Object v) { >>>> return v instanceof Visitor >>>> && ((Visitor)v).process(this); >>>> } >>>> // Other methods common to all Nodes ... >>>> } >>>> public static final class RootNode extends Node { >>>> public interface Visitor { >>>> boolean processRoot(RootNode node); >>>> } >>>> @Override >>>> public boolean visit(Object v) { >>>> return v instanceof Visitor >>>> ? ((Visitor)v).processRoot(this) >>>> : super.visit(v); >>>> } >>>> // Other stuff about root nodes ... >>>> } >>>> public static abstract class ValueNode extends Node { >>>> public interface Visitor { >>>> boolean processValue(ValueNode node); >>>> } >>>> @Override >>>> public boolean visit(Object v) { >>>> return v instanceof Visitor >>>> ? ((Visitor)v).processValue(this) >>>> : super.visit(v); >>>> } >>>> } >>>> public static final class BooleanNode extends ValueNode { >>>> public interface Visitor { >>>> boolean processBool(BooleanNode node); >>>> } >>>> @Override >>>> public boolean visit(Object v) { >>>> return v instanceof Visitor >>>> ? ((Visitor)v).processBool(this) >>>> : super.visit(v); >>>> } >>>> // Other stuff about booleans ... >>>> } >>>> public static final class StringNode extends ValueNode { >>>> // Much the same as BooleanNode >>>> } >>>> This goes on for some time: there is a multi-layered hierarchy of dozens >>>> of node >>>> types, each with a boolean visit(Object) method, and their own distinct >>>> Visitor >>>> interface, in this file. I should note that this code is actually not >>>> written >>>> by a human, but rather generated by some process (I didn’t look into how). >>>> I >>>> still think it is worth mentioning here for two reasons: first, whoever >>>> wrote >>>> the code generator would probably do something similar if writing it by >>>> hand, >>>> and second because these visitors are used often by hand-written code. >>>> Speaking of hand-written code, visitor subclasses now get to declare ahead >>>> of >>>> time exactly which kinds of nodes they care about, by implementing only the >>>> appropriate Visitor interfaces: >>>> private class FooVisitor implements StringNode.Visitor, >>>> BooleanNode.Visitor, RootNode.Visitor { >>>> // ... >>>> } >>>> This isn’t how I would have written things, but I can sorta see the >>>> appeal, if >>>> you don’t have to write it all by hand: a visitor can choose to handle any >>>> one >>>> subclass of ValueNode, or all ValueNodes, or just RootNode and StringNode, >>>> et >>>> cetera. They get to pick and choose what sub-trees of the inheritance tree >>>> they >>>> work with. >>>> Would Node be a good sealed class? Maybe. It clearly intends to enumerate >>>> all >>>> subclasses, but the benefit it gets from enforcing that is minimal. As in >>>> my >>>> previous examples, the main advantage for Node implementors would come from >>>> records, and the main advantage for clients would come from >>>> pattern-matching, >>>> obviating their need for this giant visitor. >>>> The Enumerated Node >>>> Another AST, this time for some kind of query language, explicitly >>>> declares an >>>> enum of all subclasses it can have, and uses this enum instead of using >>>> traditional double-dispatch: >>>> public interface Node { >>>> enum Kind {EXPR, QUERY, IMPORT /* and 9 more */} >>>> Kind getKind(); >>>> Location getLocation(); >>>> } >>>> public abstract record AbstractNode(Location l) implements Node {} >>>> public class Expr extends AbstractNode { >>>> public Kind getKind() {return EXPR;} >>>> // ... >>>> } >>>> // And so on for other Kinds ... >>>> public abstract class Visitor { >>>> // Empty default implementations, not abstract. >>>> public Expr visitExpr(Expr e) {} >>>> public Query visitQuery(Query q) {} >>>> public Import visitImport(Import i) {} >>>> public Node visit(Node n) { >>>> switch (n.getKind()) { >>>> case EXPR: return visitExpr((Expr)n); >>>> case QUERY: return visitQuery((Query)n); >>>> case IMPORT: return visitImport((Import)n); >>>> // ... >>>> } >>>> } >>>> } >>>> It’s not really clear to me why they do it this way, instead of putting an >>>> accept(Visitor) method on Node. They gain the ability to return different >>>> types >>>> for each Node subtype, but are hugely restricted in what visitors can do: >>>> they >>>> must return a Node, instead of performing an arbitrary computation. It >>>> seems >>>> like the idea is visitors must specialize to tree rewriting, but I still >>>> would >>>> have preferred to parameterize the visitor by return type. >>>> Would this be better as a sealed type? I feel sure that if sealed types >>>> existed, >>>> the authors of this class would have used one. We could certainly do away >>>> with >>>> the enum, and use an expression-switch instead to pattern-match in the >>>> implementation of visit(Node). But I think the Visitor class would still >>>> exist, >>>> and still have separate methods for each Node subtype, because they >>>> developer >>>> seemed to care about specializing the return type. The only place where an >>>> exhaustiveness check helps would be in the visit(Node) method, inside the >>>> visitor class itself. All other dispatch goes through visit(Node), or >>>> through >>>> one of the specialized visitor methods if the type is known statically. It >>>> seems like overall this would be an improvement, but again, the improvement >>>> comes primarily from pattern-matching, not sealing. >>>> Colocated interface implementations >>>> Finally, I looked for interfaces having all of their implementations >>>> defined in >>>> the same file. On this I do have some statistical data[3]. A huge majority >>>> (98.5%) of public interfaces have at least one implementation in a >>>> different >>>> source file. Package-private interfaces also tend to have implementations >>>> in >>>> other files: 85% of them are in this category. For protected interfaces >>>> it’s >>>> much closer: only 53% have external implementations. Of course, all private >>>> interfaces have all implementations in a single file. >>>> Next, I looked at interfaces that share a source file with all their >>>> implementations, to see whether they’d make good sealed types. First was >>>> this >>>> Entry class: >>>> public interface Entry { >>>> enum Status {OK, PENDING, FAILED} >>>> Status getStatus(); >>>> int size(); >>>> String render(); >>>> } >>>> public class UserEntry implements Entry { >>>> private User u; >>>> private Status s; >>>> public UserEntry(User u, Status s) { >>>> this.u = u; >>>> this.s = s; >>>> } >>>> @Override String render() {return [ http://u.name/ | u.name ] ();} >>>> @Override int size() {return 1;} >>>> @Override Status getStatus() {return s;} >>>> } >>>> public class AccountEntry implements Entry { >>>> private Account a; >>>> private Status s; >>>> public UserEntry(Account a, Status s) { >>>> this.a = a; >>>> this.s = s; >>>> } >>>> @Override String render() {return a.render();} >>>> @Override int size() {return a.size();} >>>> @Override Status getStatus() {return s;} >>>> } >>>> A huge majority of the clients of this Entry interface treat it >>>> polymorphically, >>>> just calling its interface methods. In only one case is there an instanceof >>>> check made on an Entry, dispatching to different methods depending on which >>>> subclass is present. >>>> Is this a good sealed type? I think not, really. There are two >>>> implementations >>>> now, but perhaps there will be a GroupEntry someday. Existing clients >>>> should >>>> continue to work in that case: the polymorphic Entry interface provides >>>> everything clients are “intended” to know. >>>> Another candidate for sealing: >>>> public interface Request {/* Empty */} >>>> public record RequestById(int id) implements Request; >>>> public record RequestByValue(String owner, boolean historic) implements >>>> Request; >>>> public class RequestFetcher { >>>> public List<Item> fetch(Iterable<Request> requests) { >>>> List<RequestById> idReqs = Lists.newArrayList(); >>>> List<RequestByValue> valueReqs = Lists.newArrayList(); >>>> List<Query> queries = Lists.newArrayList(); >>>> for (Request req : requests) { >>>> if (req instanceof RequestById) { >>>> idReqs.add((RequestById)req); >>>> } else if (req instanceof RequestByValue) { >>>> valueReqs.add((RequestByValue)req); >>>> } >>>> } >>>> queries.addAll(prepareIdQueries(idReqs)); >>>> queries.addAll(prepareValueQueries(valueReqs)); >>>> return runQueries(queries); >>>> } >>>> } >>>> Interestingly, since the Request interface is empty, the only way to do >>>> anything >>>> with this class is to cast it to one implementation type. In fact, the >>>> RequestFetcher I include here is the only usage of either of these classes >>>> (plus, of course, helpers like prepareIdQueries). >>>> So, clients need to know about specific subclasses, and want to be sure >>>> they’re >>>> doing exhaustive pattern-matching. Seems like a great sealed class to me. >>>> Except...actually each of the two subclasses has been extended by a >>>> decorator >>>> adding a source[4]: >>>> public record SourcedRequestById(Source source) extends RequestById; >>>> public record SourcedRequestByValue(Source source) extends RequestByValue; >>>> Does this argue in favor of sealing, or against? I don’t really know. The >>>> owners >>>> of Request clearly intended for all four of these subclasses to exist >>>> (they’re >>>> in the same package), so they could include them all in the permitted >>>> subtype >>>> list, but it seems like a confusing API to expose to clients. >>>> A third candidate for sealing is another simple sum type: >>>> public interface ValueOrAggregatorException<T> { >>>> T get(); >>>> public static <T> ValueOrAggregatorException<T> >>>> of(T value) { >>>> return new OfValue<T>(value); >>>> } >>>> public static <T> ValueOrAggregatorException<T> >>>> ofException(AggregatorException err) { >>>> return new OfException<T>(err); >>>> } >>>> private record OfValue<T>(T value) >>>> implements ValueOrAggregatorException<T> { >>>> @Override T get() {return value;} >>>> } >>>> private record OfException<T>(AggregatorException err) >>>> implements ValueOrAggregatorException<T> { >>>> @Override T get() {throw err;} >>>> } >>>> } >>>> It has only two subtypes, and it seems unimaginable there could ever be a >>>> third, >>>> so why not seal it? However, the subtypes are intentionally hidden: it is >>>> undesirable to let people see whether there’s an exception, except by >>>> having it >>>> thrown at you. In fact AggregatorException is documented as “plugins may >>>> throw >>>> this, but should never catch it”: there is some higher-level thing >>>> responsible >>>> for catching all such exceptions. So, this type gains no benefit from >>>> exhaustiveness checks in pattern-matching. The type is intended to be used >>>> polymorphically, through its interface method, even though its private >>>> implementation is amenable to sealing. >>>> ________________ >>>> [1] Throughout this document I will use record syntax as if it were >>>> already in >>>> the language. This is merely for brevity, and to avoid making the reader >>>> spend >>>> a lot of time reading code that boils down to just storing a couple >>>> fields. In >>>> practice, of course the code in Google’s codebase either defines the >>>> records by >>>> hand, or uses an @AutoValue. >>>> [2] Recall that @AutoValue, Google’s “record”, allows extending a class, >>>> which >>>> is semantically okay here: DbResult has no state, only behavior. >>>> [3]This data is imperfect. While the Google codebase strongly discourages >>>> having >>>> more than one version of a module checked in, there is still some amount of >>>> “vendoring” or checking in multiple versions of some package, e.g. for >>>> supporting external clients of an old version of an API. As a result, two >>>> “different files” which are really copies of each other may implement >>>> interfaces with the same fully-qualified name; I did not attempt to >>>> control for >>>> this case, and so such cases may look like they were in the same file, or >>>> not. >>>> [4] Of course in the record proposal it is illegal to extend records like >>>> this; >>>> in real life these plain data carriers are implemented by hand as ordinary >>>> classes, so the subtyping is legal.