Switch translation

Brian Goetz Fri, 06 Apr 2018 08:53:10 -0700

The following outlines our story for translating improved switches, including both the switch improvements coming as part of JEP 325, and follow-on work to add pattern matching to switches. Much of this has been discussed already over the last year, but here it is in one place.


# Switch Translation
#### Maurizio Cimadamore and Brian Goetz
#### April 2018


## Part 1 -- constant switches

This part examines the current translation of `switch` constructs by `javac`, and proposes a more general translation for switching on primitives, boxes, strings, and enums, with the goals of:

- Unify the treatment of `switch` variants, simplifying the compiler implementation and reducing the static footprint of generated code;
- Move responsibility for target classification from compile time to run time, allowing us to more freely update the logic without updating the compiler.


## Current translation

Switches on `int` (and the smaller integer primitives) are translated in one of two ways. If the labels are relatively dense, we translate an `int` switch to a `tableswitch`; if they are sparse, we translate to a `lookupswitch`. The current heuristic appears to be that we use a `tableswitch` if it results in a smaller bytecode than a `lookupswitch` (which uses twice as many bytes per entry), which is a reasonable heuristic.
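
To make the size comparison concrete, here is a minimal sketch of such a heuristic. It is not `javac`'s actual code; the class and method names are hypothetical, and the byte counts follow the operand layouts of the two instructions (a `tableswitch` carries a default, low, and high value plus one 4-byte offset per slot in the range; a `lookupswitch` carries a default and a pair count plus one 8-byte match/offset pair per label), ignoring alignment padding.

```java
final class SwitchOpcodeSketch {
    /** Operand bytes of a tableswitch covering lo..hi: default, low, high, then one 4-byte offset per slot. */
    static long tableSwitchBytes(int lo, int hi) {
        return 12L + 4L * ((long) hi - lo + 1);
    }

    /** Operand bytes of a lookupswitch: default, npairs, then one 8-byte (match, offset) pair per label. */
    static long lookupSwitchBytes(int labelCount) {
        return 8L + 8L * labelCount;
    }

    /** Hypothetical helper: picks the smaller encoding for a non-empty set of distinct labels. */
    static String chooseOpcode(int... labels) {
        int lo = labels[0];
        int hi = labels[0];
        for (int label : labels) {
            lo = Math.min(lo, label);
            hi = Math.max(hi, label);
        }
        return tableSwitchBytes(lo, hi) <= lookupSwitchBytes(labels.length)
                ? "tableswitch" : "lookupswitch";
    }
}
```

Under these assumptions, labels `1, 2, 3, 4` select `tableswitch`, while labels `1, 1000` select `lookupswitch`, since the table would need a slot for every value in between.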


#### Switches on boxes

Switches on primitive boxes are currently implemented as if they were primitive switches, unconditionally unboxing the target before entry (possibly throwing NPE).


#### Switches on strings

Switches on strings are implemented as a two-step process, exploiting the fact that strings cache their `hashCode()` and that hash codes are reasonably spread out. Given a switch on strings like the one below:


    switch (s) {
        case "Hello": ...
        case "World": ...
        default: ...
    }

The compiler desugars this into two separate switches, where the first switch maps the input strings into a range of numbers [0..1], as shown below, which can then be used in a subsequent plain switch on ints. The generated code unconditionally calls `hashCode()`, again possibly throwing NPE.


    int index=-1;
    switch (s.hashCode()) {
        case 12345: if (!s.equals("Hello")) break; index = 0; break;
        case 6789: if (!s.equals("World")) break; index = 1; break;
        default: index = -1;
    }
    switch (index) {
        case 0: ...
        case 1: ...
        default: ...
    }

If there are hash collisions between the strings, the first switch must try all possible matching strings.
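
As a runnable illustration of the two-step shape, here is a hand-written equivalent with real hash codes filled in. The class and method names and the returned strings are hypothetical; only the structure mirrors the translation described above.

```java
final class StringSwitchDesugarSketch {
    static String greet(String s) {
        // Step 1: map the string to a dense case index via its hash code.
        int index = -1;
        switch (s.hashCode()) {          // throws NullPointerException if s is null
            case 69609650:               // "Hello".hashCode()
                if (s.equals("Hello")) index = 0;
                break;
            case 83766130:               // "World".hashCode()
                if (s.equals("World")) index = 1;
                break;
        }
        // Step 2: an ordinary int switch over the dense indices.
        switch (index) {
            case 0:  return "matched Hello";
            case 1:  return "matched World";
            default: return "default";
        }
    }
}
```

The `equals` check in step 1 is what guards against a different string that happens to share a hash code with one of the labels.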


#### Switches on enums

Switches on `enum` constants exploit the fact that enums have (usually dense) integral ordinal values. Unfortunately, because an ordinal value can change between compilation time and runtime, we cannot rely on this mapping directly, but instead need to do an extra layer of mapping. Given a switch like:


    switch(color) {
        case RED: ...
        case GREEN: ...
    }

The compiler numbers the cases starting at 1 (as with string switch), and creates a synthetic class that maps the runtime values of the enum ordinals to the statically numbered cases:


    class Outer$0 {
        synthetic final int[] $EnumMap$Color = new int[Color.values().length];

        static {
            try { $EnumMap$Color[RED.ordinal()] = 1; } catch (NoSuchFieldError ex) {}
            try { $EnumMap$Color[GREEN.ordinal()] = 2; } catch (NoSuchFieldError ex) {}
        }
    }

Then, the switch is translated as follows:

    switch(Outer$0.$EnumMap$Color[color.ordinal()]) {
        case 1: stmt1;
        case 2: stmt2;
    }

In other words, we construct an array whose size is the cardinality of the enum, and then the element at position *i* of such array will contain the case index corresponding to the enum constant whose ordinal is *i*.
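
The same mapping can be written out by hand as ordinary Java. This is a sketch with hypothetical names (`ENUM_MAP` stands in for the synthetic `$EnumMap$Color` field, and a third constant `BLUE` is added to show an unlisted case); it is not the compiler's literal output.

```java
final class EnumSwitchDesugarSketch {
    enum Color { RED, GREEN, BLUE }

    // Plays the role of the synthetic Outer$0.$EnumMap$Color array.
    static final int[] ENUM_MAP = new int[Color.values().length];
    static {
        try { ENUM_MAP[Color.RED.ordinal()] = 1; } catch (NoSuchFieldError ex) { }
        try { ENUM_MAP[Color.GREEN.ordinal()] = 2; } catch (NoSuchFieldError ex) { }
    }

    static String describe(Color color) {
        switch (ENUM_MAP[color.ordinal()]) {
            case 1:  return "red";
            case 2:  return "green";
            default: return "unlisted";   // BLUE has no case, so its slot stays 0
        }
    }
}
```

Numbering the cases from 1 means the array's default value of 0 naturally denotes "no case for this constant".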


## A more general scheme

The handling of strings and enums gives us a hint of how to create a more regular scheme; for `switch` targets more complex than `int`, we lower the `switch` to an `int` switch with consecutive `case` labels, and use a separate process to map the target into the range of synthetic case labels.

Now that we have `invokedynamic` in our toolbox, we can reduce all of the non-`int` cases to a single form, where we number the cases with consecutive integers, and perform case selection via an `invokedynamic`-based classifier function, whose static argument list receives a description of the actual targets, and which returns an `int` identifying what `case` to select.


This approach has several advantages:
 - Reduced compiler complexity -- all switches follow a common pattern;
 - Reduced static code size;
 - The classification function can select from a wide range of strategies (linear search, binary search, building a `HashMap`, constructing a perfect hash function, etc), which can vary over time or from situation to situation;
 - We are free to improve the strategy or select an alternate strategy (say, to optimize for startup time) without having to recompile the code;
 - Hopefully at least as JIT-friendly as the existing translation, if not more.

We can also use this approach in preference to `lookupswitch` for non-dense `int` switches, as well as use it to extend `switch` to handle `long`, `float`, and `double` targets (which were surely excluded in part because the JVM didn't provide a convenient translation target for these types.)


#### Bootstrap design

When designing the `invokedynamic` bootstraps to support this translation, we face the classic lumping-vs-splitting decision. For now, we'll bias towards splitting. In the following example, `BOOTSTRAP_PREAMBLE` indicates the usual leading arguments for an indy bootstrap. We assume the compiler has numbered the case values densely from 0..N-1, and the bootstrap will return [0,N) for success, or N for "no match".


A strawman design might be:

    // Numeric switches for P, accepts invocation as P -> I or Box(P) -> I
    CallSite intSwitch(BOOTSTRAP_PREAMBLE, int... caseValues)

    // Switch for String, invocation descriptor is String -> I
    CallSite stringSwitch(BOOTSTRAP_PREAMBLE, String... caseValues)

    // Switch for Enum, invocation descriptor is E -> I
    CallSite enumSwitch(BOOTSTRAP_PREAMBLE, Class<Enum<E extends Enum<E>>> clazz,
                        String... caseNames)

It might be possible to encode all of these into a single bootstrap, but given that the compiler already treats each type slightly differently, it seems there is little value in this sort of lumping for non-pattern switches.

The `enumSwitch` bootstrap as proposed uses `String` values to describe the enum constants, rather than encoding the enum constants directly via condy. This allows us to be more robust to enums disappearing after compilation.
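
One way a bootstrap might exploit name-based description is to resolve the names once at link time and quietly skip any that no longer exist. The sketch below is an assumption about how that could look, not the proposed implementation; `linkCases` and `classify` are hypothetical helpers, and a real bootstrap would return a `CallSite` rather than a raw table.

```java
import java.util.Arrays;

final class EnumSwitchLinkSketch {
    enum Color { RED, GREEN, BLUE }

    /**
     * Hypothetical link-time step: returns a table indexed by ordinal whose
     * entries are case indices, or caseNames.length for "no match".
     */
    static <E extends Enum<E>> int[] linkCases(Class<E> clazz, String... caseNames) {
        int[] table = new int[clazz.getEnumConstants().length];
        Arrays.fill(table, caseNames.length);
        for (int i = 0; i < caseNames.length; i++) {
            try {
                table[Enum.valueOf(clazz, caseNames[i]).ordinal()] = i;
            } catch (IllegalArgumentException missing) {
                // The constant no longer exists at run time; that case can never match.
            }
        }
        return table;
    }

    /** Classification is then a single array load on the run-time ordinal. */
    static <E extends Enum<E>> int classify(int[] table, E target) {
        return table[target.ordinal()];
    }
}
```

Because the table is built from run-time ordinals, it also covers the case where constants were reordered after compilation.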

This strategy is also dependent on having broken the limitation on 253 bootstrap arguments in indy/condy.


#### Extending to other primitive types

This approach extends naturally to other primitive types (long, double, float), by the addition of some more bootstraps (which need to deal with the additional complexities of infinity, NaN, etc):


    CallSite longSwitch(BOOTSTRAP_PREAMBLE, long... caseValues)
    CallSite floatSwitch(BOOTSTRAP_PREAMBLE, float... caseValues)
    CallSite doubleSwitch(BOOTSTRAP_PREAMBLE, double... caseValues)
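
The NaN complication is easy to see in a small sketch. The matching rule used here (bitwise equality via `Double.doubleToLongBits`) is only one possible choice and is an assumption, not something this proposal specifies; with it, a `NaN` label can match, but `0.0` and `-0.0` become distinct labels. With plain `==`, a `NaN` label could never match at all.

```java
final class DoubleSwitchSketch {
    /** Returns the index of the first matching case, or caseValues.length for "no match". */
    static int classify(double target, double... caseValues) {
        long bits = Double.doubleToLongBits(target);   // collapses every NaN to one bit pattern
        for (int i = 0; i < caseValues.length; i++) {
            if (Double.doubleToLongBits(caseValues[i]) == bits) {
                return i;
            }
        }
        return caseValues.length;
    }
}
```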

#### Extending to null

The scheme as proposed above does not explicitly handle nulls, which is a feature we'd like to have in `switch`. There are a few ways we could add null handling into the API:


 - Split entry points into null-friendly or null-hostile switches;
 - Find a way to encode nulls in the array of case values (which can be done with condy);
 - Always treat null as a possible input and a distinguished output, and have the compiler ensure the switch can handle this distinguished output.

The last strategy is appealing and straightforward; assign a sentinel value (-1) to `null`, and always return this sentinel when the input is null. The compiler ensures that some case handles `null`, and if no case handles `null` then it inserts an implicit


    case -1: throw new NullPointerException();

into the generated code.

#### General example

If we have a string switch:

    switch (x) {
        case "Foo": m(); break;
        case "Bar": n(); // fall through
        case "Baz": r(); break;
        default: p();
    }

we translate into:

    int t = indy[bsm=stringSwitch["Foo", "Bar", "Baz"]](x)
    switch (t) {
        case -1: throw new NullPointerException();  // implicit null case
        case 0: m(); break;
        case 1: n(); // fall through
        case 2: r(); break;
        case 3: p();                                // default case
    }

All switches, with the exception of `int` switches (and maybe not even non-dense `int` switches), follow this exact pattern. If the target type is not a reference type, the `null` case is not needed.

This strategy is implemented in the `switch` branch of the amber repository; see `java.lang.runtime.SwitchBootstraps` in that branch for (rough!) implementations of the bootstraps.
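
For readers without that branch at hand, here is an independent, minimal sketch of what a `stringSwitch` bootstrap could look like using only the standard `java.lang.invoke` API. It is not the code in the amber repository. The `HashMap` strategy, the `classify` helper, and the `run` method (which invokes the bootstrap by hand to simulate the indy call site and the lowered switch from the general example) are all assumptions made for illustration.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

final class StringSwitchBootstrapSketch {
    /** Bootstrap; the first three parameters are the usual indy preamble. */
    static CallSite stringSwitch(MethodHandles.Lookup lookup, String invocationName,
                                 MethodType invocationType, String... caseValues)
            throws ReflectiveOperationException {
        Map<String, Integer> table = new HashMap<>();
        for (int i = 0; i < caseValues.length; i++) {
            table.putIfAbsent(caseValues[i], i);
        }
        MethodHandle classify = MethodHandles.lookup().findStatic(
                StringSwitchBootstrapSketch.class, "classify",
                MethodType.methodType(int.class, Map.class, int.class, String.class));
        // Bind the table and the "no match" index, leaving a String -> int handle.
        MethodHandle bound = MethodHandles.insertArguments(classify, 0, table, caseValues.length);
        return new ConstantCallSite(bound.asType(invocationType));
    }

    private static int classify(Map<String, Integer> table, int noMatch, String target) {
        if (target == null) {
            return -1;                                   // sentinel for null
        }
        Integer index = table.get(target);
        return index == null ? noMatch : index;
    }

    /** Simulates the indy call site plus the lowered switch; records which methods would run. */
    static String run(String x) {
        int t;
        try {
            CallSite site = stringSwitch(MethodHandles.lookup(), "stringSwitch",
                    MethodType.methodType(int.class, String.class), "Foo", "Bar", "Baz");
            t = (int) site.dynamicInvoker().invokeExact(x);
        } catch (Throwable e) {
            throw new IllegalStateException(e);
        }
        StringBuilder calls = new StringBuilder();
        switch (t) {
            case -1: throw new NullPointerException();   // implicit null case
            case 0: calls.append("m"); break;
            case 1: calls.append("n");                   // fall through
            case 2: calls.append("r"); break;
            case 3: calls.append("p");                   // default case
        }
        return calls.toString();
    }
}
```

In real generated code the call site would be linked once and reused; `run` relinks on every call only to keep the sketch self-contained.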


## Patterns in narrow-target switches

When we add patterns, we may encounter switches whose targets are tightly typed (e.g., `String` or `int`) but still use some patterns in their expression. For switches whose target type is a primitive, primitive box, `String`, or `enum`, we'd like to use the optimized translation strategy outlined here, but the following kinds of patterns might still show up in a switch on, say, `Integer`:


    case var x:
    case _:
    case Integer x:
    case Integer(var x):

The first three can be translated away by the source compiler, as they are semantically equivalent to `default`. If any nontrivial patterns are present (including deconstruction patterns), we may need to translate as a pattern switch scheme -- see Part 2. (While the language may not distinguish between "legacy" and "pattern" switches -- in that all switches are pattern switches -- we'd like to avoid giving up obvious optimizations if we can.)


# Part 2 -- type test patterns and guards

A key motivation for reexamining switch translation is the impending arrival of patterns in switch. We expect switch translation for the pattern case to follow a similar structure -- lower to an `int` switch and use an indy-based classifier to select an index. However, there are a few additional complexities. One is that pattern cases may have guards, which means we need to be able to re-enter the bootstrap with an indication to "continue matching from case N", in the event of a failed guard. (Even if the language doesn't support guards directly, the obvious implementation strategy for nested patterns is to desugar them into guards.)

Translating pattern switches is more complicated because there are more options for how to divide the work between the statically generated code and the switch classifier, and different choices have different performance side-effects (are binding variables "boxed" into a tuple to be returned, or do they need to be redundantly calculated).


## Type-test patterns

Type-test patterns are notable because their applicability predicate is purely based on the type system, meaning that the compiler can directly reason about it both statically (using flow analysis, optimizing away dynamic type tests) and dynamically (with `instanceof`.) A switch involving type-tests:


    switch (x) {
        case String s: ...
        case Integer i: ...
        case Long l: ...
    }

can (among other strategies) be translated into a chain of `if-else` using `instanceof` and casts:


    if (x instanceof String) { String s = (String) x; ... }
    else if (x instanceof Integer) { Integer i = (Integer) x; ... }
    else if (x instanceof Long) { Long l = (Long) x; ... }

#### Guards

The `if-else` desugaring can also naturally handle guards:

    switch (x) {
        case String s
            where (s.length() > 0): ...
        case Integer i
            where (i > 0): ...
        case Long l
            where (l > 0L): ...
    }

can be translated to:

    if (x instanceof String
        && ((String) x).length() > 0) { String s = (String) x; ... }
    else if (x instanceof Integer
             && ((Integer) x) > 0) { Integer i = (Integer) x; ... }
    else if (x instanceof Long
             && ((Long) x) > 0L) { Long l = (Long) x; ... }
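
Filled in with concrete arm bodies, the chain above compiles and runs as ordinary Java. The method name and the returned strings are hypothetical placeholders for the elided `...` bodies.

```java
final class GuardedIfElseSketch {
    static String describe(Object x) {
        if (x instanceof String && ((String) x).length() > 0) {
            String s = (String) x;
            return "non-empty string " + s;
        } else if (x instanceof Integer && ((Integer) x) > 0) {
            Integer i = (Integer) x;
            return "positive int " + i;
        } else if (x instanceof Long && ((Long) x) > 0L) {
            Long l = (Long) x;
            return "positive long " + l;
        }
        return "no match";
    }
}
```

Note that an empty `String` passes the first `instanceof`, fails its guard, and then still runs the `Integer` and `Long` tests before falling out of the chain.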

#### Performance concerns

The translation to `if-else` chains is simple (for switches without fallthrough), but is harder for the VM to optimize, because we've used a more general control flow mechanism. If the target is an empty `String`, which means we'd pass the first `instanceof` but fail the guard, class-hierarchy analysis could tell us that it can't possibly be an `Integer` or a `Long`, and so there's no need to perform those tests. But generating code that takes advantage of this information is more complex.

In the extreme case, where a switch consists entirely of type test patterns for final classes, this could be performed as an O(1) operation by hashing. And this is a common case involving switches over alternatives in a sum (sealed) type. (We shouldn't rely on finality at compile time, as this can change between compile and run time, but we should take advantage of this at run time if we can.)

Finally, the straightforward static translation may miss opportunities for optimization. For example:


    switch (x) {
        case Point p
            where p.x > 0 && p.y > 0: A
        case Point p
            where p.x > 0 && p.y == 0: B
    }

Here, not only would we potentially test the target twice to see if it is a `Point`, but we then further extract the `x` component twice and perform the `p.x > 0` test twice.


#### Optimization opportunities

The compiler can eliminate some redundant calculations through straightforward techniques. The previous switch can be transformed to:


    switch (x) {
        case Point p:
            if (((Point) p).x > 0 && ((Point) p).y > 0) { A }
            else if (((Point) p).x > 0 && ((Point) p).y == 0) { B }
    }

to eliminate the redundant `instanceof` (which also admits further CSE optimizations.)
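
Taking the CSE step one further by hand, the shared `p.x > 0` test can be hoisted so that it runs once. This is a sketch of the end result in plain Java, with a hypothetical `Point` class and string results standing in for arms `A` and `B`.

```java
final class CoalescedPointSketch {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static String classify(Object target) {
        if (target instanceof Point) {          // single type test for both clauses
            Point p = (Point) target;
            if (p.x > 0) {                      // shared subexpression, evaluated once
                if (p.y > 0)  return "A";
                if (p.y == 0) return "B";
            }
        }
        return "no match";
    }
}
```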


#### Clause reordering

The above example was easy to transform because the two `case Point` clauses were adjacent. But what if they are not? In some cases, it is safe to reorder them. For types `T` and `U`, it is safe to reorder `case T` and `case U` if the two types have no intersection; that is, there can be no types that are subtypes of them both. This is true when `T` and `U` are classes and neither extends the other, or when one is a final class and the other is an interface that the class does not implement.

The compiler could then reorder case clauses so that all the ones whose first test is `case Point` are adjacent, and then coalesce them all into a single arm of the `if-else` chain.

A possible spoiler here is fallthrough; if case A falls into case B, then cases A and B have to be moved as a group. (This is another reason to consider limiting fallthrough.)

A bigger possible spoiler here is separate compilation. If at compile time, we see that `T` and `U` are disjoint types, do we want to bake that assumption into the compilation, or do we have to re-check that assumption at runtime?
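
If the assumption were re-checked at run time, the disjointness rule stated earlier could be evaluated reflectively against the loaded classes. The following is a sketch of that rule only; `provablyDisjoint` is a hypothetical helper, and it answers conservatively (returns `false`) whenever a common subtype cannot be ruled out.

```java
import java.lang.reflect.Modifier;

final class DisjointTypesSketch {
    static boolean provablyDisjoint(Class<?> t, Class<?> u) {
        if (t.isAssignableFrom(u) || u.isAssignableFrom(t)) {
            return false;                        // one is a subtype of the other
        }
        if (!t.isInterface() && !u.isInterface()) {
            return true;                         // unrelated classes share no subclass
        }
        // At least one interface: a common subtype is impossible only if the other type is a final class.
        return isFinalClass(t) || isFinalClass(u);
    }

    private static boolean isFinalClass(Class<?> c) {
        return !c.isInterface() && Modifier.isFinal(c.getModifiers());
    }
}
```

For example, `String` and `Integer` are provably disjoint, as are `String` and `Runnable` (since `String` is final), while `ArrayList` and `Runnable` are not, because some subclass of `ArrayList` could implement `Runnable`.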


#### Summary of if-else translation

While the if-else translation at first looks pretty bad, we are able to extract a fair amount of redundancy through well-understood compiler transformations. If an N-way switch has only M distinct types in it, in most cases we can reduce the cost from _O(N)_ to _O(M)_. Sometimes _M == N_, so this doesn't help, but sometimes _M << N_ (and sometimes `N` is small, in which case _O(N)_ is fine.)

Reordering clauses involves some risk; specifically, that the class hierarchy will change between compile and run time. It seems eminently safe to reorder `String` and `Integer`, but more questionable to reorder an arbitrary class `Foo` with `Runnable`, even if `Foo` doesn't implement `Runnable` now, because it might easily be changed to do so later. Ideally we'd like to perform class-hierarchy optimizations using the runtime hierarchy, not the compile-time hierarchy.


## Type classifiers

The technique outlined in _Part 1_, where we lower the complex switch to a dense `int` switch, and use an indy-based classifier to select an index, is applicable here as well. First let's consider a switch consisting only of unguarded type-test patterns, optionally with a default clause.

We'll start with an `indy` bootstrap whose static arguments are `Class` constants corresponding to each arm of the switch, whose dynamic argument is the switch target, and whose return value is a case number (or distinguished sentinels for "no match" and `null`.) We can easily implement such a bootstrap with a linear search, but can also do better; if some subset of the classes are `final`, we can choose between these more quickly (such as via binary search on `hashCode()`, hash function, or hash table), and we need perform only a single operation to test all of those at once. Dynamic techniques (such as building a hash map of previously seen target types), which `indy` is well-suited to, can asymptotically approach _O(1)_ even when the classes involved are not final.


So we can lower:

    switch (x) {
        case T t: A
        case U u: B
        case V v: C
    }

to

    int y = indy[bootstrap=typeSwitch(T.class, U.class, V.class)](x)
    switch (y) {
        case 0: A
        case 1: B
        case 2: C
    }

This has the advantages that the generated code is very similar to the source code, we can (in some cases) get _O(1)_ dispatch performance, and we can handle fallthrough with no additional complexity.
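
To illustrate the "hash map of previously seen target types" idea, here is a sketch of the classification logic on its own, outside any bootstrap. The class name, the use of `ConcurrentHashMap`, and the sentinel choices (-1 for `null`, N for "no match") are assumptions for this example; a production version would also need to avoid pinning classes in the cache (for instance with `ClassValue`).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class TypeSwitchClassifierSketch {
    private final Class<?>[] caseTypes;
    // Remembers the answer for each concrete class seen so far.
    private final Map<Class<?>, Integer> seen = new ConcurrentHashMap<>();

    TypeSwitchClassifierSketch(Class<?>... caseTypes) {
        this.caseTypes = caseTypes.clone();
    }

    int classify(Object target) {
        if (target == null) {
            return -1;                           // sentinel for null
        }
        // First sighting of a class pays for the linear search; later ones are a map lookup.
        return seen.computeIfAbsent(target.getClass(), this::linearSearch);
    }

    private int linearSearch(Class<?> targetClass) {
        for (int i = 0; i < caseTypes.length; i++) {
            if (caseTypes[i].isAssignableFrom(targetClass)) {
                return i;
            }
        }
        return caseTypes.length;                 // "no match"
    }
}
```

Caching by the target's concrete class works here because, without guards, the answer depends only on that class.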


#### Guards

There are two approaches we could take to add support for guards into the process; we could try to teach the bootstrap about guards (and would have to pass locals that appear in guard expressions as additional arguments to the classifier), or we could leave guards to the generated bytecode. The latter seems far more attractive, but requires some tweaks to the bootstrap arguments and to the shape of the generated code.

If the classifier says "you have matched case #3", but then we fail the guard for #3, we want to go back into the classifier and start again at #4. (Sometimes the classifier can also use this information ("start over at #4") to optimize away unnecessary tests.)

We add a second argument (where to start) to the classifier invocation signature, and wrap the switch in a loop, lowering:


    switch (target) {
        case T t where (e1): A
        case T t where (e2): B
        case U u where (e3): C
    }

into

    int index = -1; // start at the top
    while (true) {
        index = indy[...](target, index)
        switch (index) {
            case 0: if (!e1) continue; A
            case 1: if (!e2) continue; B
            case 2: if (!e3) continue; C
            default: break;
        }
        break;
    }

For cases where the same type test is repeated in consecutive positions (at N and N+1), we can have the static compiler coalesce them as above, or we could have the bootstrap maintain a table so that if you re-enter the bootstrap where the previous answer was N, then it can immediately return N+1. Similarly, if N and N+1 are known to be mutually exclusive types (like `String` and `Integer`), on reentering the classifier with N, we can skip right to N+2 since if we matched `String`, we cannot match `Integer`. Lookup tables for such optimizations can be built at callsite linkage time.
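
The re-entry protocol can be tried out in plain Java by replacing the indy call with an ordinary method. In this sketch the `classify` method is a naive linear scan standing in for the bootstrap-linked classifier (it uses none of the table optimizations just described), and the case types, guards, and result strings are hypothetical.

```java
final class GuardedTypeSwitchSketch {
    // Case types in source order: two String arms with different guards, then an Integer arm.
    private static final Class<?>[] CASES = { String.class, String.class, Integer.class };

    /** Stand-in for the classifier: resumes matching after the previously returned index. */
    static int classify(Object target, int previous) {
        for (int i = previous + 1; i < CASES.length; i++) {
            if (CASES[i].isInstance(target)) {
                return i;
            }
        }
        return CASES.length;                     // "no match"
    }

    static String describe(Object target) {
        int index = -1;                          // start at the top
        while (true) {
            index = classify(target, index);
            switch (index) {
                case 0: {
                    String s = (String) target;
                    if (!(s.length() > 3)) continue;   // guard failed: re-enter the classifier
                    return "long string";
                }
                case 1: {
                    String s = (String) target;
                    if (!(s.length() > 0)) continue;
                    return "short string";
                }
                case 2: {
                    Integer i = (Integer) target;
                    if (!(i > 0)) continue;
                    return "positive int";
                }
                default:
                    return "no match";
            }
        }
    }
}
```

For the input `"hi"`, the first arm matches on type but fails its guard, so the loop re-enters the classifier with index 0 and lands on the second arm.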


#### Mixing constants and type tests

This approach also extends to tests that are a mix of constant patterns and type-test patterns, such as:


    switch (x) {
        case "Foo": ...
        case 0L: ...
        case Integer i:
    }

We can extend the bootstrap protocol to accept constants as well as types, and it is a straightforward optimization to combine both type matching and constant matching in a single pass.
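
A single-pass classifier over mixed labels might look like the sketch below. The encoding is an assumption made for illustration: a label that is a `Class` object is treated as a type test, and any other label is treated as a constant compared with `equals` (so this encoding could not express a `Class` constant as a case label).

```java
final class MixedSwitchClassifierSketch {
    /** Returns the first matching label index, -1 for null, or labels.length for "no match". */
    static int classify(Object target, Object... labels) {
        if (target == null) {
            return -1;
        }
        for (int i = 0; i < labels.length; i++) {
            Object label = labels[i];
            boolean matches = (label instanceof Class)
                    ? ((Class<?>) label).isInstance(target)   // type-test pattern
                    : label.equals(target);                   // constant pattern
            if (matches) {
                return i;
            }
        }
        return labels.length;
    }
}
```

With the labels `"Foo"`, `0L`, `Integer.class`, the `Integer` value `0` does not match the `long` constant `0L` (a boxed `Integer` never equals a boxed `Long`) and is instead selected by the type test.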


## Nested patterns

Nested patterns are essentially guards; even if we don't expose guards in the language, we can desugar


    case Point(0, var x):

into the equivalent of

    case Point(var a, var x) && a matches 0:

using the same translation story as above -- use the classifier to select a candidate case arm based on the top-type of the pattern, and then do additional checks in the generated bytecode, and if the checks fail, continue and re-enter the classifier starting at the next case.


#### Explicit continue

An alternative to exposing guards is to expose an explicit `continue` statement in switch, which would have the effect of "keep matching at the next case." Then guards could be expressed imperatively as:


    case P:
        if (!guard)
            continue;
        ...
        break;
    case Q: ...
