On 3/2/2022 1:43 PM, Dan Heidinga wrote:

Making the pattern match compatible with assignment conversions makes
sense to me and follows a similar rationale to that used with
MethodHandle::asType following the JLS 5.3 invocation conversions.
Though with MHs we had the ability to add additional conversions under
MethodHandles::explicitCastArguments. With pattern matching, we don't
have the same ability to make the "extra" behaviour opt-in / opt-out.
We just get one chance to pick the right behaviour.

Indeed.  And the thing that I am trying to avoid here is creating _yet another_ new context in which a different bag of ad-hoc conversions is possible.  While it might be justifiable from a local perspective to say "it's OK if `int x` does unboxing, but having it do range checking seems new and different, so let's not do that", from a global perspective, that means we add a new context ("pattern match context") to the existing list of assignment, loose invocation, strict invocation, cast, and numeric contexts.  That is the kind of incremental complexity I'd like to avoid, if there is a unifying move we can pull.
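To see how little the existing contexts already agree with each other, consider constant narrowing, which assignment context permits but invocation context does not (a small illustration; the method name is made up):

    static void takeByte(byte b) { }

    static void demo() {
        byte b = 100;      // fine: assignment context narrows a constant that provably fits
        // takeByte(100);  // does not compile: invocation context has no such narrowing
    }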

Conversions like unboxing or casting are burdened by the fact that they have to be total, which means the "does it fit" / "if so, do it" / "if not, do something else (truncate, throw, etc.)" steps all have to be crammed into a single operation.  What pattern matching does is extract the "does it fit, and if so do it" part into a more primitive operation, from which other operations can be composed.

At some level, what I'm proposing is all spec-shuffling; we'll either say "a widening primitive conversion is allowed in assignment context", or we'll say that a primitive type pattern `P p` matches any primitive type Q that can be widened to P.  We'll end up with a similar number of rules, but we might be able to "shake the box" and make them settle into a lower-energy state, and be able to define (whether we explicitly do so or not) assignment context as supporting "all the cases where the LHS, viewed as a type pattern, is exhaustive on the RHS, potentially with remainder, and throws if remainder is encountered."  (That's what unboxing does: it throws when remainder is encountered.)
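Concretely, under today's rules, assignment-context unboxing handles its remainder (null) by throwing, while an instanceof pattern just declines to match it.  A minimal sketch (`lookup` is an imaginary helper):

    Integer box = lookup();   // may return null
    int i = box;              // unboxing is total: if box is null, the remainder is handled by throwing NPE

    Number n = box;
    if (n instanceof Integer iv) {
        // matched: there really was an int-sized value here
    } else {
        // the remainder (null) just falls out here; nothing throws
    }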

As to the range check, it has always bugged me that you see code that looks like:

    if (i >= -128 && i <= 127) { byte b = (byte) i; ... }

because of the accidental specificity, and the attendant risk of error (using <= instead of <, or using 127 instead of 128). Being able to say:

    if (i instanceof byte b) { ... }

is better not because it is more compact, but because you're actually asking the right question -- "does this int value fit in a byte."  I'm sad we don't really have a way to ask this question today; it seems an omission.
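The closest thing I know of today is the cast-and-compare idiom, which answers the question, but only obliquely (a sketch; the helper name is made up):

    static boolean fitsInByte(int i) {
        return (byte) i == i;   // true exactly when i is in the byte range [-128, 127]
    }

It works, but nothing about it reads as "does this int value fit in a byte".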

Intuitively, the behaviour you propose is kind of what we want - all
the possible byte cases end up in the byte case and we don't need to
adapt the long case to handle those that would have fit in a byte.
I'm slightly concerned that this changes Java's historical approach
and may lead to surprises when refactoring existing code that treats
unbox(Long) one way and unbox(Short) another.  Will users be confused
when an unbox(Long) whose value is in the short range ends up in a
case that was only intended for unbox(Short)?  I'm having a hard time
finding an example that would trip on this, but my lack of imagination
isn't definitive =)

I'm worried about this too.  We examined it briefly, and ran away, when we were thinking about constant patterns, specifically:

    Object o = ...
    switch (o) {
        case 0: ...
        default: ...
    }

What would this mean?  What I wouldn't want it to mean is "match Long 0, Integer 0, Short 0, Byte 0, Character 0"; that feels like it is over the line for "magic".  (Note that this is about defining what the _constant pattern_ means, not the primitive type pattern.)  I think it's probably reasonable to say this is a type error; 0 is applicable to primitive numerics and their boxes, but not to Number or Object.  I think that is consistent with what I'm suggesting about primitive type patterns, but I'd have to think about it more.

Something like the following shouldn't be surprising given the
existing rules around unbox + widening primitive conversion (though it
may be surprising when first encountered, as I expect most users
haven't really internalized the JLS 5.2 rules):

As Alex said to me yesterday: "JLS Ch 5 contains many more words than any prospective reader would expect to find on the subject, but once the reader gets over the overwhelm of how much there is to say, they will find none of the words surprising."  There's a deeper truth to this statement: Java is not actually as simple a language as its mythology suggests, but we win by hiding the complexity in places users generally don't have to look, and if and when they do confront the complexity, they find it unsurprising, and go back to ignoring it.

So in point of fact, *almost no one* has read JLS 5.2, but it still does "what users would likely find reasonable".
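For instance, here is a sampling of what 5.2 quietly allows, all of which reads as perfectly ordinary code (a small sketch):

    int anInt = 42;
    long wide = anInt;        // widening primitive conversion
    Integer boxed = anInt;    // boxing conversion
    byte small = 100;         // narrowing of a constant that provably fits
    // byte bad = 200;        // rejected: 200 does not fit in a byte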

Number n = ...;
switch (n) {
    case long l -> ...
    case int i  -> ...   // dead code
    case byte b -> ...   // dead code
    default     -> ...
}

Correct.  We have rules for pattern dominance, which are used to give compile errors on dead cases; we'd have to work through the details to confirm that `long l` dominates `int i`, but I'd hope this is the case.
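For reference types, dominance checking already behaves this way; a quick sketch under the current pattern-switch rules:

    Object o = ...
    switch (o) {
        case String s        -> System.out.println("a string");
        case CharSequence cs -> System.out.println("some other CharSequence");
        default              -> System.out.println("something else");
    }
    // Swapping the first two cases is a compile-time error today:
    // `case String s` would be dominated by `case CharSequence cs`.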

But this may be more surprising, as I suggested above:

Number n = new Long(5);
switch (n) {
    case byte b -> ...   // matches here
    case int i  -> ...
    case long l -> ...
    default     -> ...
}

Overall, I like the extra dynamic range check but would be fine with
leaving it out if it complicates the spec given it feels like a pretty
deep-in-the-weeds corner case.

It is probably not a forced move to support the richer interpretation of primitive patterns now.  But I think the consequence of leaving it out may be surprising: rather than "simplifying the language" (as one might hope that "leaving something out" would do), I think there's a risk that it makes things more complicated, because (a) it effectively creates yet another conversion context that is distinct from the too-many we have now, and (b) it creates a sharp edge where refactoring from local variable initialization to let-bind doesn't work, because assignment would then be looser than let-bind.

One reason this is especially undesirable is that one of the forms of let-bind is a let-bind *expression*:

    let P = p, Q = q
    in <expression>

which is useful for pulling out subexpressions and binding them to a variable, but for which the scope of that variable is limited.  If refactoring from:

    int x = stuff;
    m(f(x));

to

    m(let x = stuff in f(x))
    // x no longer in scope here

was not possible because of a silly mismatch between the conversions in let context and the conversions in assignment context, then we're putting users in the position of having to choose between richer conversions and richer scoping.

(Usual warning (Remi): I'm mentioning let-expressions because it gives a sense of where some of these constraints come from, but this is not a suitable time to design the let-expression feature.)

