Along the lines of the previous discussion about separate compilation skew with enums ... I'm trying to find the right place to draw the line with respect to post-compilation class hierarchy changes.

Recall that we can impose a _dominance ordering_ on patterns; pattern P dominates Q if everything that is matched by Q also is matched by P. We already use this today, in catch blocks, to reject programs with dead code; you can't say `catch Exception` before `catch IOException`, because the latter block would be dead. We want to do the same with patterns, so:

    case String x: ...
    case Object x: ...

is OK but

    case Object x: ...
    case String x: ...

is rejected at compile time.

Separately, we'd like for pattern matching to be efficient; the definition of "inefficient" would be for pattern matching to be inherently O(n), when we can frequently do much better. There's plenty of literature on compiling patterns to decision trees, but none of them address the problem we have to: separate compilation. So any decision tree computed at compile time might be wrong in undesirable ways by runtime. We could also compute a decision tree at runtime using indy; while this is our intent, the devil is in the details. We don't want computing the tree to be too expensive, nor do we want to have to capture O(n^2) compile-time constraints to be validated at runtime. So I'd like to focus on what changes we're willing to accept between compilation and runtime, what our expectations would be for those changes.

We've already discussed one of these: novel values in enum / sealed type switches, and for them, the answer is throwing some sort of exception. Another that we dealt with long ago is changing enum ordinals; we decided at the time that we're willing for this to be a BC change, so we generate extra code that uses the as-runtime ordinals rather than the as-compile-time ordinals when lowering the switch into an integer switch. (If we weren't willing to tolerate such changes, we'd have a simpler translation: just lower an enum switch to a switch on its ordinal.)

Here's one that I suspect we're not expecting to recover terribly well from: hierarchy inversion. Suppose at compile time A <: B. So the following is a sensible switch body:

    case String: println("String"); break;
    case Object: println("Object"); break;

Now, imagine that by runtime, String no longer extends Object, but instead Object absurdly extends String. Do we still expect the above to print String for all Strings, and Object for everything else? Or is the latter arm now dead at runtime, even though it wouldn't compile after the change? Or is this now UB, because it would no longer compile?

A more realistic example of a hierarchy change is introducing an interface. If we have:

    interface I { }
    class C { }

and a switch

    case I: ...
    case C: ...

and later, we make C implement I, we have a similar situation; the switch would no longer compile. Are we allowed to make optimizations based on the compile-time knowledge that C <! I?

As an example, suppose A, B, C, ... Z are final classes, and I is an interface implemented by none of them. Then I can dispatch:

    case A: ...
    case B: ...
    ...
    case I: ...
    ...
    case Z: ...
    case Object: ...

in two type operations; hash the class of the target and look it up in a table containing A...Z, and then do a test against I. However, if I'm required to deal with the case where some of A..Z are retrofitted to implement I after compile time, and I'm expected to process the switch in order based on how it is written, then I have to fall back to O(1) type operations at runtime, or, I have to do as many as O(n^2) type comparisons at link time. These are steep cliffs to fall off of. (Mandating throwing an exception at link time is also expensive.)

Today, all switch cases are totally unordered, so we're free to execute them in O(1) time. I'd like for that to continue to be the case, even as we add more complex switches.

So, let's have a conversation about expectations for what we should do for a switch at runtime that would no longer compile due to post-compilation hierarchy changes (new supertypes, hierarchy inversions, removed supertypes, final <--> nonfinal, etc.)

Reply via email to