The next big pattern matching JEP will be about deconstruction patterns.  (Static and instance patterns will likely come separately.)  Now that we've got the bikeshed painting underway, there are a few other loose ends here, and one of them is overload selection.

We explored taking the existing overload selection algorithm and turning it inside out, but after going down that road a bit, I think this is both unnecessary complexity for not enough value, and also potentially fraught with nasty corner cases.  I think there is a much simpler answer here which is entirely good enough.

First, let's remind ourselves, why do we have constructor overloading in the first place?  There are three main reasons:

 - Concision.  If a fully-general constructor takes many parameters, but not all are essential to the use case, then the construction site becomes a site of accidental complexity.  Being able to handle common groupings of parameters simplifies use sites.

 - Flexibility.  Related to the above, not only might the user not need to specify a given constructor parameter, but they may want the flexibility of saying "let the implementation pick the best value".  Constructors with fewer parameters reserve more flexibility for the implementation.

 - Alternative representations.  Some objects may take multiple representations as input, such as accepting a Date, a LocalDate, or a LocalDateTime.
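
For example (a hypothetical `Event` class, purely to illustrate), a class whose internal representation is a `LocalDateTime` can still accept the alternative representations directly:

    import java.time.LocalDate;
    import java.time.LocalDateTime;
    import java.time.ZoneId;
    import java.util.Date;

    public final class Event {
        private final LocalDateTime timestamp;

        // Canonical representation
        public Event(LocalDateTime timestamp) { this.timestamp = timestamp; }

        // Alternative representations, normalized to the canonical one
        public Event(LocalDate date) { this(date.atStartOfDay()); }
        public Event(Date legacy) {
            this(LocalDateTime.ofInstant(legacy.toInstant(), ZoneId.systemDefault()));
        }
    }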

The first two cases are generally handled with "telescoping constructor nests", where we have:

    Foo(A a)
    Foo(A a, B b)
    Foo(A a, B b, C c, D d)

Sometimes the telescopes don't fold perfectly, and become "trees":

    Foo(A a)
    Foo(A a, B b)
    Foo(A a, C c, D d)
    Foo(A a, B b, C c, D d)

Which constructors to include is a subjective judgment on the part of class authors, trading off code size against concision and flexibility.

We had initially assumed that each constructor overload would have a corresponding deconstructor, but further experimentation suggests this is not an ideal assumption.

Clue One that it is not a good assumption comes from the asymmetry between constructors and deconstructors; if we have constructors and deconstructors of shape C(List), then it is OK to invoke C's constructor with List or its subtypes, but we can invoke C's deconstructor with List or its subtypes or its supertypes.
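
Record patterns, which we already have, exhibit this asymmetry at the use site; here is a small sketch with a hypothetical `Box` record:

    record Box(Number n) { }

    static void demo() {
        // Construction: the argument must be a Number or one of its subtypes.
        Box b = new Box(42);                 // OK: Integer is a subtype of Number
        // new Box(new Object());            // error: Object is not a Number

        // Deconstruction: the nested type pattern may name a subtype *or* a
        // supertype of the component type, since applicability is castability-based.
        Object o = b;
        switch (o) {
            case Box(Integer i) -> System.out.println("subtype pattern: " + i);
            case Box(Object x)  -> System.out.println("supertype pattern: " + x);
            default             -> System.out.println("not a Box");
        }
    }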

Clue Two is that applicability for constructors is based on method invocation context, but applicability for deconstructors is based on cast context, which has different rules.  It seems unlikely that we will ever get symmetry given this.

The "Flexibility" requirement does not really apply to deconstructors; having a deconstructor that accepts additional bindings does not constrain anything, not in the same way as a constructor taking needlessly specific arguments.  Imagine if ArrayList had only constructors that take int (for array capacity); this is terrible for the constructor, because it forces a resource management decision onto users who will not likely make a very good decision, and one that is hard to change later, but pretty much harmless for deconstructors.

The "Concision" requirement does not really apply as much to deconstructors as constructors; matching with `Foo(var a, _, _)` is not nearly as painful as invoking with lots of parameters, each of which require an explicit choice by the user.

So the main reason for overloading deconstructors is to match representations with the constructor overloads -- but for a given "representation set", there probably do not need to be as many deconstructors as constructors. What we really need is to match the "maximal" constructor in a telescoping nest with a corresponding deconstructor, or, for a tree-shaped set, one for each "maximal" representation.

So for a class with constructors

    Foo()
    Foo(A a)
    Foo(A a, B b)
    Foo(X x)
    Foo(X x, Y y)

we would want dtors for (A,B) and (X,Y), but don't really need the others.


So, let's start fresh on overload selection. Deconstructors have a set of applicability rules based on arity first (eventually, varargs, but not yet) and then on applicability of type patterns, which is in turn rooted in castability.  Because we don't have the compatibility problem introduced by autoboxing, we can ignore the distinction between phase 1 and 2 of overload selection (we will have this problem with varargs later, though.)

Given this, the main question we have to resolve is to what degree -- if any -- we may deem one overload "more applicable" than others.  I think there is one rule here that is forced: an exact type match (modulo erasure) is more applicable than an inexact type match.  So given:

    D(Object o)
    D(String s)

then

    case D(String s)

should choose the latter.  This allows the client to (mostly) steer to a specific overload just by using the right types (rather than `var` or a subtype.)  It is not clear to me whether we need anything more here; in the event of ambiguity, a client can pick the right overload with the right type patterns.  (Nested patterns may need to be manually unrolled to subsequent clauses in some cases.)

So basically (on a per-binding basis): an exact match is more applicable than an inexact match, and ... that's it. Users can steer towards a particular overload by selecting exact matches on enough bindings.  Libraries can provide their own "joins" if they want to disambiguate problematic overloads like:

    D(Object o, String s)
    D(String s, Object o)
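
This is the same kind of ambiguity we already get with ordinary method overloads, and presumably the same kind of "join" resolves it -- an overload that is more specific than both:

    // Method-overload analogy in current Java (not deconstructors): with only
    // the first two overloads, m("a", "b") is an ambiguity error; adding the
    // third -- the "join", more specific than both -- makes the call compile.
    static void m(Object o, String s) { }
    static void m(String s, Object o) { }
    static void m(String s, String t) { }   // the "join"

    static void caller() {
        m("a", "b");   // selects m(String, String)
    }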


