Re: with and binary backward compatibility

2022-06-14 Thread Brian Goetz




The problem is that if we run that code with the new version of Point (the one
with 3 components), newPoint.z is not equal to point.z but to 0, so once there is
a 'with' somewhere, there is no backward compatibility anymore.


Yes, in that case, we experience something like "decapitation" of the z 
value.   But note the same thing is true without the `with` mechanism!


    // Old code
    record Point(int x, int  y) { }

    class Foo {
        Point increment(Point p) {
            return new Point(p.x() + 1, p.y() + 1);
        }
    }

    // modify Point and recompile Point but not Foo
    record Point(int x, int y, int z) {
        Point(int x, int y) { this(x, y, 0); }
    }

    // new code
    Point p = new Point(1,2,3);
    System.out.println(foo.increment(p));  // Point[x=2, y=3, z=0]

This has nothing to do with "with"; without `with`, we have the same 
"brittle" constructor/deconstructor selection.  With `with`, we have the 
same, it's just not as obvious.  Actually, it's better, because then 
it's only a separate-compilation artifact; if you recompile the 
with-using client, things are righted, whereas with the explicit 
ctor-and-accessor version, it is never righted.  So `with` is actually a 
compatibility improvement here, in that the brittle selection goes away 
after recompilation.
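
To make the separate-compilation story concrete, here is a sketch in today's Java of what the hypothetical accessor + canonical-constructor translation of a client's `p with { x = x + 1 }` would expand to before and after recompiling against the three-component Point above (method names are illustrative):

    // Compiled against the old two-component Point: only x and y are known,
    // so the commit uses the two-arg constructor and z silently becomes 0.
    static Point oldTranslation(Point p) {
        int x = p.x(), y = p.y();
        x = x + 1;
        return new Point(x, y);
    }

    // Recompiled against the new Point: all three components are lifted and
    // committed through the canonical constructor, so z is preserved.
    static Point newTranslation(Point p) {
        int x = p.x(), y = p.y(), z = p.z();
        x = x + 1;
        return new Point(x, y, z);
    }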





Re: "With" for records

2022-06-12 Thread Brian Goetz



On 6/12/2022 12:21 PM, fo...@univ-mlv.fr wrote:





*From: *"Brian Goetz" 
*To: *"Remi Forax" 
*Cc: *"amber-spec-experts" 
*Sent: *Saturday, June 11, 2022 8:16:26 PM
*Subject: *Re: "With" for records

We also probably want a rule to _prevent_ assignment to any locals
*other than* the synthetic component locals.  Assigning to uplevel
locals from within a `with` block seems like asking for trouble;
the with block is like a transform on the component locals. 



perhaps any locals other than the synthetic ones and the ones declared 
inside the block.


Yes, sorry that's exactly what I meant.  The reason for this rule is 
perhaps not obvious, but when we get to arbitrary ctor/dtor pairs, we 
will need to do _overload selection_ based on the names used in the 
block.  And it needs to be unambiguous which assignments in the block 
are intended to be to components.  For locals declared in the block, we 
can conclude that assignments to these are not component assignments.




it is possible that the inner block might want additional
information from one of the enclosing `contents` variables. 



or inside the block we may want to have access to the parameters, like in:

  record Complex(double re, double im) {
    Complex withRe(double re) {
      return this with { re = re_from_outer_scope; }  // find a syntax here!
    }
  }


My hope is we do not have to find a syntax.  As you say, we can 
introduce an intermediate local:


    double outerRe = re;
    return this with { re = outerRe }

This also is a good candidate for further refinement with our friend 
"let expressions":


    return let double outerRe = re
           in this with { re = outerRe }



and two other related questions about the syntax
- do we allow omitting the curly braces if there is only one assignment
    complex with re = 3
- or do we allow omitting the last semicolon if there is only one 
assignment, like in your example

    complex with { re = 3 }


Yes, these are reasonable special cases to consider (as we do with 
single-argument lambdas, or throw on the RHS of an arrow case.)
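
For reference, the existing special cases alluded to, in today's Java (fragments; Function is java.util.function.Function):

    // single-parameter lambda: parentheses around the parameter may be omitted
    Function<String, Integer> len = s -> s.length();

    // `throw` allowed directly on the RHS of an arrow case
    static int digit(char c) {
        return switch (c) {
            case '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' -> c - '0';
            default -> throw new IllegalArgumentException("not a digit: " + c);
        };
    }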




Re: "With" for records

2022-06-11 Thread Brian Goetz
We also probably want a rule to _prevent_ assignment to any locals 
*other than* the synthetic component locals.  Assigning to uplevel 
locals from within a `with` block seems like asking for trouble; the 
with block is like a transform on the component locals.


However, we may need a story (hope not) for _accessing_ uplevel shadowed 
locals.  For example:


    record R(A contents) { }
    record A(B contents) { }
    record B(int contents) { }

    R r = ...
    R rr = r with { contents = contents with { contents = 3 }}

it is possible that the inner block might want additional information 
from one of the enclosing `contents` variables.




On 6/10/2022 11:25 AM, Brian Goetz wrote:
About the declaration of local variables, in Java, there is no 
hiding/shadowing between local variables, so a code like this is 
rejected ? Or do we introduce a special rule for the hiding of 
implicit variables ?


Yes, we probably do need a special rule for this.  The component names 
are fixed, and collisions are likely, so these synthetic variables 
will probably have to be allowed to shadow other locals. 


Re: "With" for records

2022-06-11 Thread Brian Goetz
I got a private mail asking, basically: why not "just" generate withers 
for each component, so you could say `point.x(newX).y(newY)`.


This would be a much weaker feature than is being proposed, in several 
dimensions.


1.  It doesn't scale to arbitrary classes; it's a record-specific hack.  
Which means value classes are left out in the cold, as are immutable 
classes that can't be records or values for whatever reason.  The link I 
cited suggests how we're going to get to arbitrary classes; I wouldn't 
support this feature if we couldn't get there.


2.  It is strictly less powerful.  Say you have a record with an 
invariant that constrains multiple fields, such as:


    record OddOrEvenPair(int a, int b) {
        OddOrEvenPair {
            if (a % 2 != b % 2)
                throw new IllegalArgumentException();
        }
    }

This requires that a and b both be even, or both be odd.  Note that 
there's no path from (2, 2) to (3, 3); any attempt to do `new OOEP(2, 
2).a(3).b(3)` will fail when we try to reconstruct the intermediate 
state.  You need a wither that does both a and b at once.  (And we're 
not going to generate the 2^n combinations.)
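
To make that concrete, a sketch in today's Java with hand-written per-component withers (the `a(int)` and `b(int)` methods stand in for what would be generated):

    record OddOrEvenPair(int a, int b) {
        OddOrEvenPair {
            if (a % 2 != b % 2)
                throw new IllegalArgumentException();
        }
        OddOrEvenPair a(int newA) { return new OddOrEvenPair(newA, b); }  // hypothetical wither
        OddOrEvenPair b(int newB) { return new OddOrEvenPair(a, newB); }  // hypothetical wither
    }

    // new OddOrEvenPair(2, 2).a(3)  // throws: the intermediate (3, 2) violates the invariant;
    // a single "wither" over both components could go from (2, 2) to (3, 3) directly.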




On 6/10/2022 8:44 AM, Brian Goetz wrote:

In

https://github.com/openjdk/amber-docs/blob/master/eg-drafts/reconstruction-records-and-classes.md

we explore a generalized mechanism for `with` expressions, such as:

    Point shadowPos = shape.position() with { x = 0 }

The document evaluates a general mechanism involving matched pairs of 
constructors (or factories) and deconstruction patterns, which is 
still several steps out, but we can take this step now with records, 
because records have all the characteristics (significant names, 
canonical ctor and dtor) that are needed.  The main reason we might 
wait is if there are uncertainties in the broader target.


Our C# friends have already gone here, in a way that fits into C#, 
using properties (which makes sense, as their language is built on that):


    object with { property-assignments }

The C# interpretation is that the RHS of this expression is a sort of 
"DSL", which permits property assignment but nothing else.  This is 
cute, but I think we can do better.


In our version, the RHS is an arbitrary block of Java code; you can 
use loops, assignments, exceptions, etc.  The only thing that makes it 
"special" is that that the components of the operand are lifted into 
mutable locals on the RHS.  So inside the RHS when the operand is a 
Point, there are fresh mutable locals `x` and `y` which are 
initialized with the X and Y values of the operand.  Their values are 
committed at the end of the block using the canonical constructor.


This should remind people of the *compact constructor* in a record; 
the body is allowed to freely mutate the special variables (which also 
don't have obvious declarations), and their terminal values determine 
the state of the record.


Just as we were able to do record patterns without having full-blown 
deconstructors, we can do with expressions on records as well, because 
(a) we still have a canonical ctor, (b) we have accessors, and (c) we 
know the names of the components.


Obviously when we get value types, we'll want classes to be able to 
expose (or not) such a mechanism (both for internal or external use).


 Digression: builders

As a bonus, I think `with` offers us a better path to getting rid of 
builders than the (problematic) one everyone asks for (default values 
on constructor parameters.)  Consider the case of a record with many 
components, all of which are optional:


    record Config(int a,
                  int b,
                  int c,
                  ...
                  int z) {
    }

Obviously, no one wants to call the canonical constructor with 26 
values.  The standard workaround is a builder, but that's a lot of 
ceremony.  The `with` mechanism gives us a way out:


    record Config(int a,
                  int b,
                  int c,
                  ...
                  int z) {

        private Config() {
            this(0, 0, 0, ... 0);
        }

        public static Config BUILDER = new Config();
    }

Now we can just say

    Config c = Config.BUILDER with { c = 3; q = 45; }

The constant isn't even necessary; we can just open up the 
constructor.  And if there are some required args, the constructor can 
expose them too.  Suppose a and b are required, but c..z are 
optional.  Then:


    record Config(int a,
                  int b,
                  int c,
                  ...
                  int z) {

        public Config(int a, int b) {
            this(a, b, 0, ... 0);
        }
    }

    Config c = new Config(1, 2) with { c = 3; q = 45; }

In this way, the record acts as its own builder.

(As an added bonus, the default values do not suffer from the "brittle 
constant" problem that a default value would likely suffer from, 
because they are an implementation detail of the constructor, not an 
exposed part of the API.)

Re: "With" for records

2022-06-10 Thread Brian Goetz




The block is also special because there is an implicit return at the 
end? So I believe "return" should be disallowed inside the block, right?


That's a good question; we can go in multiple directions with this. One 
would be to simply interpret "return" as "return from the enclosing 
method".  We confronted this with the blocks on the RHS of -> in switch 
expressions, and decided to disallow return there; that's probably a 
good default choice here too.




Does the lifting that appears at the beginning of the block call all 
accessors? Or does the compiler try to be clever and not call an accessor 
when its value is not needed?


There are several reasons to just call them all, the biggest of which is 
that the accessors are incidental; really, there's a synthetic 
deconstruction pattern in the record, whose implementation just may 
delegate to accessors.  There is little semantic or performance benefit 
to not calling them (there shouldn't be side effects in accessors 
anyway, and the JIT will likely inline the calls away and see that 
unneeded fetches are dead anyway.)



For example, in
  point with { y = 3; }
calling point.y() is useless, or is it something the JIT will take 
care of ?


For an ordinary record, the accessor is a field access, it will get 
inlined, and the fetch will be dead.  So, yes.


Do we allow "with" followed by an empty block ? As a way to clone a 
record ?


Yes.  The RHS is just a block of Java.  If it does nothing, it does 
nothing.  Free cloning.


About the declaration of local variables, in Java, there is no 
hiding/shadowing between local variables, so a code like this is 
rejected ? Or do we introduce a special rule for the hiding of 
implicit variables ?


Yes, we probably do need a special rule for this.  The component names 
are fixed, and collisions are likely, so these synthetic variables will 
probably have to be allowed to shadow other locals.


yes, Complex.default with { re = 3; im = 4; } seems a great fit for 
value classes.


Or even better:

    value class Complex {
        ...
        Complex conj() -> this with { im = -im; }
    }



"With" for records

2022-06-10 Thread Brian Goetz

In

https://github.com/openjdk/amber-docs/blob/master/eg-drafts/reconstruction-records-and-classes.md

we explore a generalized mechanism for `with` expressions, such as:

    Point shadowPos = shape.position() with { x = 0 }

The document evaluates a general mechanism involving matched pairs of 
constructors (or factories) and deconstruction patterns, which is still 
several steps out, but we can take this step now with records, because 
records have all the characteristics (significant names, canonical ctor 
and dtor) that are needed.  The main reason we might wait is if there 
are uncertainties in the broader target.


Our C# friends have already gone here, in a way that fits into C#, using 
properties (which makes sense, as their language is built on that):


    object with { property-assignments }

The C# interpretation is that the RHS of this expression is a sort of 
"DSL", which permits property assignment but nothing else.  This is 
cute, but I think we can do better.


In our version, the RHS is an arbitrary block of Java code; you can use 
loops, assignments, exceptions, etc.  The only thing that makes it 
"special" is that that the components of the operand are lifted into 
mutable locals on the RHS.  So inside the RHS when the operand is a 
Point, there are fresh mutable locals `x` and `y` which are initialized 
with the X and Y values of the operand.  Their values are committed at 
the end of the block using the canonical constructor.
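
A minimal sketch, in today's Java, of what that lifting-and-commit would amount to for `shape.position() with { x = 0 }`, assuming the accessor + canonical-constructor translation (the method name is illustrative):

    record Point(int x, int y) { }

    static Point shadowOf(Point position) {
        int x = position.x();      // components lifted into fresh mutable locals
        int y = position.y();
        x = 0;                     // the body of the `with` block
        return new Point(x, y);    // terminal values committed via the canonical constructor
    }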


This should remind people of the *compact constructor* in a record; the 
body is allowed to freely mutate the special variables (which also don't 
have obvious declarations), and their terminal values determine the 
state of the record.


Just as we were able to do record patterns without having full-blown 
deconstructors, we can do with expressions on records as well, because 
(a) we still have a canonical ctor, (b) we have accessors, and (c) we 
know the names of the components.


Obviously when we get value types, we'll want classes to be able to 
expose (or not) such a mechanism (both for internal or external use).


 Digression: builders

As a bonus, I think `with` offers us a better path to getting rid of 
builders than the (problematic) one everyone asks for (default values on 
constructor parameters.)  Consider the case of a record with many 
components, all of which are optional:


    record Config(int a,
                  int b,
                  int c,
                  ...
                  int z) {
    }

Obviously, no one wants to call the canonical constructor with 26 
values.  The standard workaround is a builder, but that's a lot of 
ceremony.  The `with` mechanism gives us a way out:


    record Config(int a,
                  int b,
                  int c,
                  ...
                  int z) {

        private Config() {
            this(0, 0, 0, ... 0);
        }

        public static Config BUILDER = new Config();
    }

Now we can just say

    Config c = Config.BUILDER with { c = 3; q = 45; }

The constant isn't even necessary; we can just open up the constructor.  
And if there are some required args, the constructor can expose them 
too.  Suppose a and b are required, but c..z are optional.  Then:


    record Config(int a,
                  int b,
                  int c,
                  ...
                  int z) {

        public Config(int a, int b) {
            this(a, b, 0, ... 0);
        }
    }

    Config c = new Config(1, 2) with { c = 3; q = 45; }

In this way, the record acts as its own builder.
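
A minimal sketch (today's Java, with a much smaller Config) of what `new Config(1, 2) with { c = 3; q = 45; }` would amount to under the same hypothetical translation:

    record Config(int a, int b, int c, int q) {
        public Config(int a, int b) { this(a, b, 0, 0); }
    }

    static Config configured() {
        Config base = new Config(1, 2);
        int a = base.a(), b = base.b(), c = base.c(), q = base.q();  // lift the components
        c = 3;                                                       // the `with` body
        q = 45;
        return new Config(a, b, c, q);                               // commit via the canonical constructor
    }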

(As an added bonus, the default values do not suffer from the "brittle 
constant" problem that a default value would likely suffer from, because 
they are an implementation detail of the constructor, not an exposed 
part of the API.)



I think it is reasonable at this point to take this idea off the shelf 
and work towards delivering this for records, while we're building out 
the machinery needed to deliver this for general classes.  It has no 
remaining dependencies and is immediately useful for records.


(As usual, please hold comments on small details until everyone has had 
a chance to comment on the general direction.)





Re: Simplifying switch labels

2022-06-03 Thread Brian Goetz




I wonder how we would try to explain 'case Integer x, null, Double x'... (Does 
'x' get bound to 'null'? How?)


Your suggestion to always put the null first probably helps.


Re: Simplifying switch labels

2022-06-02 Thread Brian Goetz



Oh, I guess I missed your point here, thinking that P and Q were 
constants.


Your comment implies that the two rules that restrict usage of 
patterns—can't fall through past one, and can't combine one (via ',') 
with most other labels—could be relaxed slightly in the case of 
patterns that have no bindings. I suppose that's formally true, though 
I'm not sure it's practically all that useful. (The only non-binding 
pattern we have right now is a zero-component record, right? And any 
non-binding patterns in the future could be equivalently expressed 
with 'when' clauses.)


Here's an example that's not so contrived:

    String kind = switch (o) {
    case Integer _, Long _, Short _, Character _, Byte _ -> "integral";
    case Double _, Float _ -> "floating point";
    case Boolean _ -> "boolean";
    default -> "something else";
    };

Once we have a "don't care" pattern, any pattern can become binding-less.

Looking two steps ahead, we might decide it is not so much that there 
can be _no_ bindings, as much as whether the bindings are unifiable:


    case Bag(String x), Box(String x) -> "container of string";




Re: Simplifying switch labels

2022-06-02 Thread Brian Goetz





In this framing, the restrictions about sets of elements in a single label 
don't apply, because we're talking about two different labels. But we have 
rules to prevent various abuses. Examples:

case 23: case Pattern: // illegal before and now, due to fallthrough Pattern 
rule


Ideally, the fallthrough rule should be about _bindings_, not 
_patterns_.  If P and Q are patterns with no binding variables, then it 
should be OK to say:


    case P:
    case Q:

The rule about fallthrough is to prevent falling into code where the 
bindings are not DA.



Note that the second kind of Pattern SwitchLabel is especially weird—it binds 'null' to a 
pattern variable, and requires the pattern to be a (possibly parenthesized) type pattern. 
So it's nice to break it out as its own syntactic case. I'd also suggest rethinking 
whether "case null," is the right way to express the two kinds of nullable 
SwitchLabels, but anyway now it's really clear that they are special.


This is a painful compromise.  While this may be a transitional 
phenomenon, the rule "switch matches null only when `null` appears in a 
case" is a helpful way to transition people away from "switch always 
throws on null."





Re: Named record pattern

2022-06-01 Thread Brian Goetz


For me, there is a difference between a binding and a local variable: 
bindings are the ones declared inside a pattern, and a local variable 
is how a binding is transformed to be usable inside the body if the 
pattern matches.


Our first (wrong) inclination was to treat pattern variables as a whole 
separate thing from locals.  This, in turn, led to further gratuitous 
divergence such as the treatment of finality.


The big difference between a pattern variable and a local is their 
scope.  We have carefully defined the scope of a pattern variable to be 
exactly those places where it would be DA.


So we can merge bindings and as a result have one local variable to be 
used in the body.


This only works if we are cagy about *where* the declaration is.  If we 
say that in


    case Foo(int x), Bar(int x):

that both `int x` are separate declarations, then we've foreclosed on 
this avenue.  If instead we let patterns "summon locals into existence", 
then, if they sing in proper harmony, two patterns can summon the 
same variable.  While we're not 100% dedicated to this, ruling it out 
also seems questionable.


About the annotations, if we follow the data orientation principle 
(data is more important than code), we can have a record

  record Person(@NonNull String name) { }


This is an annotation *on the record component*.  There are rules for 
how annotations flow from components to other API facets (fields, ctor 
arguments, accessor methods, etc.)
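
A small illustration in today's Java (this @NonNull is a stand-in declared locally, not a JDK annotation): with these targets, the component annotation is propagated to the field, the accessor, and the canonical constructor parameter.

    import java.lang.annotation.*;

    @Target({ElementType.RECORD_COMPONENT, ElementType.FIELD,
             ElementType.METHOD, ElementType.PARAMETER})
    @Retention(RetentionPolicy.RUNTIME)
    @interface NonNull { }

    record Person(@NonNull String name) { }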



and a record pattern
  case Person(@NonNull String s) -> ...


The question is: what is being annotated here?  The pattern itself, the 
binding variable, or the type-use of String?  Annotations on patterns 
would be a little like annotations on record components; they may flow 
through to other things, but the pattern and the binding are separate.


We want to be able to declare @NonNull in the record pattern so that if the 
data is changed, if the component of the record becomes nullable 
for example, the pattern will fail to compile.
So I think we should allow annotations, because it's a way to enforce 
that data are more important than code.


This is awfully handwavy; I'd prefer to make these decisions on some 
other basis than "it seems to arrive at the answer I want in this 
case."  In fact, everything about this argument is making me think 
annotations on pattern variables is a serious mistake.


But there is a simple solution: annotations are trees, so they can be 
compared structurally. Thus we can do binding merging if the 
annotations are the same structurally.


I don't think we want to touch this with a ten foot pole.

Re: Named record pattern

2022-05-31 Thread Brian Goetz
Gavin reminded me that we are not finalizing patterns in switch in 19 
(hard to keep track, sometimes), so we have a little bit of time to 
figure out what we want here.


One thing that is potentially confusing is that patterns work indirectly 
in a number of ways.  For example, if we have a declared deconstruction 
pattern for Point, you can't *invoke* it as you can a method; the 
language runtime invokes it on your behalf under the right situations.  
(In this way, a deconstructor is a little like a static initializer; it 
is a body of code that you declare, but you can't invoke it directly, 
the runtime invokes it for you at the right time, and that's fine.)


I had always imagined the relationship with locals being similar; a 
pattern causes a local to be injected into certain scopes, but the 
pattern itself is not a local variable declaration.  Obviously there is 
more than one way to interpret this, so we should make a more deliberate 
decision.


As a confounding example that suggests that pattern variables are not 
"just locals", in the past we talked about various forms of "merging":


    if (t instanceof Box(String s) || t instanceof Bag(String s)) { ... }

or

    case Box(String s):
    case Bag(String s):
    common-code;

If pattern variables could be annotated, then the language would be in 
the position of deciding what happens with


    case Box(@Foo(1) String s):
    case Bag(@Foo(2) String s):

(This is the "annotation merging" problem, which is why annotations are 
not inherited in the first place.)


I don't have an answer here, but I'm going to think about the various 
issues and try to capture them in more detail before proposing an answer.



On 5/31/2022 10:49 AM, Brian Goetz wrote:




Erm... I actually thought that it was your idea to allow the 'final'
modifier on patterns. This change was introduced in Java 16 (when
patterns for instanceof were finalized). Here's the initial e-mail
from you (item 2):
https://mail.openjdk.java.net/pipermail/amber-spec-experts/2020-August/002433.html 



That mail is exactly the discussion point I was thinking of.  But I 
never said that there should be a way to declare them as final at 
all!  I said it was a mistake to have made them automatically final, 
and that patterns should introduce ordinary mutable locals.  I wasn't 
suggesting an option, just that we'd picked the wrong default (and 
created a new category of complexity in the process.)


But, it's an honest leap from there to "well, of course he must have 
meant you could declare them final."  But had this been explicitly 
raised, I would have not been in favor of this option, for two reasons:


 - The conversation we are having now -- it was clear that eventually, 
some more complex pattern would introduce variables in a way such that 
there was not an obvious "local variable" declaration, and that we 
would eventually be having a "for consistency" discussion;


 - The value of being able to declare these things final is almost 
zero; the only reason we are having this conversation at all is "for 
consistency" with local variables.  But if someone said "should we add 
a feature to let you make pattern variables final", the "meh" would 
have been deafening.



instanceof @Cartesian Point p'. It looks like I cannot do the same in
the second case, which is another asymmetry.

We definitely intended to not allow declaration annotations.

But they are allowed for type test patterns, since Java 16.


Yeah, we've got a problem.



Re: Named record pattern

2022-05-31 Thread Brian Goetz





Erm... I actually thought that it was your idea to allow the 'final'
modifier on patterns. This change was introduced in Java 16 (when
patterns for instanceof were finalized). Here's the initial e-mail
from you (item 2):
https://mail.openjdk.java.net/pipermail/amber-spec-experts/2020-August/002433.html


That mail is exactly the discussion point I was thinking of.  But I 
never said that there should be a way to declare them as final at all!  
I said it was a mistake to have made them automatically final, and that 
patterns should introduce ordinary mutable locals.  I wasn't suggesting 
an option, just that we'd picked the wrong default (and created a new 
category of complexity in the process.)


But, it's an honest leap from there to "well, of course he must have meant 
you could declare them final."  But had this been explicitly raised, I 
would have not been in favor of this option, for two reasons:


 - The conversation we are having now -- it was clear that eventually, 
some more complex pattern would introduce variables in a way such that 
there was not an obvious "local variable" declaration, and that we would 
eventually be having a "for consistency" discussion;


 - The value of being able to declare these things final is almost 
zero; the only reason we are having this conversation at all is "for 
consistency" with local variables.  But if someone said "should we add a 
feature to let you make pattern variables final", the "meh" would have 
been deafening.



instanceof @Cartesian Point p'. It looks like I cannot do the same in
the second case, which is another asymmetry.

We definitely intended to not allow declaration annotations.

But they are allowed for type test patterns, since Java 16.


Yeah, we've got a problem.



Re: It's the data, stupid !

2022-05-30 Thread Brian Goetz

The problem is not at callee site, as you said you have deconstructor binding 
like you have constructor parameter, the problem is at callsite, when you have 
a Type Pattern, a type pattern does not declare a type that can be used at 
compile time but a class that is used at runtime (to do the instanceof).
So the problem is not how to declare several deconstructors, it's how to select 
the right one without type information.

Overload selection works largely the same as it does with constructors, just 
with some of the “arrows reversed”.

But I think you’re extrapolating from deconstructors too much, which are a very 
constrained kind of declared pattern.  A pattern combines an applicability test 
(does the target match the pattern), zero or more conditional extractions 
(operations performed on the target only when it is known to match), and 
creation of variables to receive the extracted data.  For a deconstruction 
pattern, the applicability test is highly constrained — it is a class-based 
instanceof test, and the language knows it.  If it's of the right type, the 
deconstruction must succeed — it is *total* on that type.  But this is a very 
special case (and a  very useful case, which is why we are doing them first.)

Consider a more complex situation, such as querying whether a regex matches a 
string, and if so, extracting the groups.  Here, we can use whatever state we 
like as our match criteria; we are not constrained to work only with runtime 
classes.  More importantly, the code that determines the match and the code that 
extracts the parameters are intertwined; if we exposed this as a boolean-returning method 
(“does it match”), we’d have to do all the work over again when we go to 
extract the groups (it's like containsKey and get in Map.)  This is why the 
current regex implementation returns a stateful matcher object.

Class-based matching is just the easy case.
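
For concreteness, the regex shape being described, using plain java.util.regex (assumes the usual Pattern/Matcher imports):

    static void groups(String input) {
        Pattern datePattern = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
        Matcher m = datePattern.matcher(input);
        if (m.matches()) {                 // the applicability test
            String year  = m.group(1);     // extraction reuses the state computed by the match
            String month = m.group(2);
            String day   = m.group(3);
        }
    }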

nope, because if we have a type pattern / record pattern we have no type 
information.
If named pattern method allows overloading, we will have the same issue, there 
is not enough type information at callsite.

I’m not following this argument.  Can you try to write out some end-to-end 
examples where you say what you are trying to accomplish?

By simpler model, i mean we do not have to mirror all the method call quirks as 
patterns, only the ones that make sense from a "data oriented design" POV.

OK, but all your arguments are “against” — you just keep saying “we’re doing it 
wrong, we can make it simpler.”  You haven’t outlined *how* it can be simpler, or 
what problems you are actually worried about.  Which makes them pretty hard to 
understand, or respond to.





Re: It's the data, stupid !

2022-05-30 Thread Brian Goetz




The problem is that what you propose is a leaky abstraction, because 
pattern matching works on classes and not on types, so it's not a 
reverse link.


("Leaky abstraction" is sort of an inflammatory term.)

What I think you're getting at is that some objects will have state that 
you can "put in", but can't "take out".  The mathematical relationship 
here is "embedding projection pair" (this is similar to an adjoint 
functor pair in some ways.)


A good example of this relationship is int and Integer.  Every int 
corresponds to an Integer, and *almost* every Integer (except null) 
corresponds to an int.  Imagine there are two functions e : int -> 
Integer and p : Integer -> int, where p(null) = bottom. Composing 
e-then-p is an identity; composing p-then-e can lose some information, 
but we can characterize the information loss.  Records form this same 
relationship with their cartesian product space (assuming you follow the 
refined contract outlined in Record::equals).  When you have this 
relationship, you get some very nice properties, such as "withers" and 
serialization basically for free.  The relationship between a ctor and 
the corresponding dtor also has this structure.  So yes, going backwards 
is "lossy", but in a controlled way.  This turns out to be good enough 
for a lot of things.
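
A tiny sketch of that e/p pair in Java (names are illustrative):

    static Integer e(int i)     { return i; }   // embedding: total on int
    static int     p(Integer i) { return i; }   // projection: partial, p(null) is "bottom" (throws NPE)

    // p(e(x)) == x for every int x, so e-then-p is the identity;
    // e(p(y)).equals(y) for every non-null Integer y, so p-then-e only loses null.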



Let say we have a class with two shapes/deconstruction

class A {
  deconstructor (B) { ... }
  deconstructor (C) { ... }
}

With the pattern A(D d), D is a runtime class not a type, you have no 
idea if it means

  instanceof A a && B b = a.deconstructor() && b instanceof D
or
  instanceof A a && C c = a.deconstructor() && c instanceof D


You can have types in the dtor bindings, just as you can have types in 
the constructor arguments.  Both may use the class type variables, as 
they are instance "members".


Unlike with a method call (constructor call) where the type of the 
arguments are available, with the pattern matching, you do not have 
the types of the arguments, only runtime classes to match.


This is where the rule of "downcast compatible" comes in.  We see this 
show up in GADT-like examples (the rules of which are next on our 
parade.)  For example, if we have:


    sealed class Node<T> { }
    record IntNode(int x) implements Node<Integer> { }

then when we switch on a Node<T>:

    switch (aNode) {
    case IntNode n: ...
    }

we may conclude, in the consequent of the appropriate case, that T=Integer.  
(Read the Kennedy and Russo paper for details.)


Similarly, if we have:

    List<String> list = ...

then when matching, we may conclude that if it's an ArrayList, it's an 
ArrayList<String>:


    switch (list) {
    case ArrayList<String> a: ...
    }

but could not say `case ArrayList<Integer> a`, because that is 
inconsistent with the target type.


So, while we can't necessarily distinguish between Foo<String> and 
Foo<Integer> because of erasure, that doesn't mean we can't use types; 
it's just that we can't conclude things that the generic type system 
won't let us.


As i said, it's a question where OOP and DOD (data oriented design ?) 
disagree one with the other.


I don't think they disagree at all.  They are both useful tools for 
modeling things; one is good for modeling entities and processes, the 
other for modeling data, using a common vocabulary.  Our systems may 
have both!


And this is a problem specific to the deconstructor, for named pattern 
method, there is no such problem, obviously a user can add as many 
pattern methods he/she want.


Because there's no name, we are limited to overloads that are distinct 
up to erasure; constructors have the same restriction.
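
The constructor-side analogue, in today's Java (Box here is just illustrative):

    import java.util.List;

    class Box {
        Box(List<String> strings) { }
        // Box(List<Integer> ints) { }   // would not compile: same erasure as Box(List<String>)
    }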



But for each way of putting together the data, there should be a
corresponding way to take it apart. 



if the pattern matching was a real inverse link, yes, maybe.


I think the word "real" is doing too much lifting in that sentence.

The problem is that a varargs pattern can also recognize a record with 
no varargs, or a class with a deconstructor with no varargs.


As can a constructor.



You think term of inverse function, we have varargs constructors so we 
should have varargs pattern, but a pattern is not an inverse function.


You are interpreting "inverse" too strictly -- and then attempting to 
use that to prematurely bury the concept.



We have the freedom to provide a simpler model.


I think the reality is that you want this to be a smaller, less 
ambitious feature than is being planned here.  That's a totally valid 
opinion!  But I think its just a difference of opinion on how much to 
invest vs how much we get out.


But by all means, try to outline (in a single mail, please) your vision 
for a simpler model.


Re: Named record pattern

2022-05-30 Thread Brian Goetz

Thank you so much for catching this.

On 5/30/2022 12:36 PM, Tagir Valeev wrote:

Hello!

I'm reading the spec draft near "14.30.1 Kinds of Patterns" [1] and I
wonder how the variable declared as named record pattern differs from
the variable declared in the type test pattern

Assuming record Point(int x, int y) {}

One can use a pattern like
   obj instanceof Point p
or use a pattern like
   obj instanceof Point(int x, int y) p
It looks like the variable 'p' should be quite similar in both cases. However:
- In the first case we are free to declare 'p' as final or not.


I must admit to being very surprised that you can do this at all!  I 
don't recall discussion on this, and had you asked me, I would have said 
that `final` has no place in type-test patterns.  Yet, I just tried it 
with jshell and it does work as you say.  I am surprised.


Can someone recall any discussion over this?  (Surely now someone will 
point me to where I agreed to this.)


Worse, it even works in switch labels!  This is definitely not what I 
had in mind.  Did this happen because we reused the local variable 
production for type patterns?  Since switch patterns are about to exit 
preview, I think we need to fix this ASAP, before switch exits preview.



It looks like,
"obj instanceof final Point(int x, int y) p" syntax is not allowed
which brings some asymmetry
  - In the first case I can use LOCAL_VARIABLE annotations like 'obj


This very question is why I would not have encouraged us to try to do 
this for type test patterns at all!



instanceof @Cartesian Point p'. It looks like I cannot do the same in
the second case, which is another asymmetry.


We definitely intended to not allow declaration annotations.  As to 
type-use annotations; well, that's a different problem, and I'm not 
quite sure what to do.  For sure, we are not going to amend the 
XxxTypeAnnotations attributes to reify the position of these 
annotations.  If we allow them and make them available only to 
annotation processors, that's another kind of asymmetry that 
someone else will complain about.



So if I want to upgrade the type test pattern on a record type to a
record pattern to match components, I need to give up some features
like finality and annotations. Is this intended?


It was not really intended that you got those features in the first place.




Re: It's the data, stupid !

2022-05-30 Thread Brian Goetz


First, I've overlooked the importance of the record pattern as a check 
of the shape of the data.


Then if we say that data are more important than code and that the aim 
of the pattern matching is to detect changes of the shapes of the data,

it changes the usefulness of some features/patterns.


OK, now that I see what argument you are really winding up for, I think 
I'm going to disagree.  Yes, data-as-data is a huge benefit; it is 
something we were not so good at before, and something that has become 
more important over time.  That has motivated us to *prioritize* the 
data-centric features of pattern matching over more general ones, 
because they deliver direct value the soonest.  But if you're trying to 
leverage that into a "this is the only benefit" (or even the main 
benefit), I think that's taking it too far.


The truly big picture here is that pattern matching is the dual of 
aggregation.  Java gives us lots of ways to put things together 
(constructors, factories, builders, maybe some day collection literals), 
but the reverse of each of these is ad-hoc, different, and usually 
harder-to-use / more error-prone.  The big picture here is that pattern 
matching *completes the object model*, by providing the missing reverse 
link.  (In mathematical terms, a constructor and deconstructor (or 
factory and static pattern, or builder and "unbuilder", or collection 
literal and collection pattern) form an *embedding-projection pair*.)


Much of this was laid out in Pattern Matching in the Java Object Model:

https://github.com/openjdk/amber-docs/blob/master/site/design-notes/patterns/pattern-match-object-model.md

it makes the varargs pattern kind of harmful, because it matches 
data of several shapes, so the code may still compile if the shape of 
the record/data-type changes.


I think you've stretched your argument to the breaking point.  No one 
said that each pattern can only match *one* structure of data. But for 
each way of putting together the data, there should be a corresponding 
way to take it apart.


 - the varargs pattern can be emulated by an array pattern and it's 
even better because an array pattern checks that the shape is an array and


Well, we don't have array patterns yet either, but just as varargs 
invocation is shorthand for a manually created array, varargs patterns 
are shorthand for an explicit array pattern.


The result is that i'm not sure the vararg pattern is a target worth 
pursuing.


I think it's fine to be "not sure", and it's doubly fine to say "I'm not 
sure the cost-benefit is so compelling, maybe there are other features 
that we should do first" (like array patterns.)  But if you're trying to 
make the argument that varargs patterns are actually harmful, you've got 
a much bigger uphill battle.


And don't forget, records are just the first vehicle here; this is 
coming for arbitrary classes too.  And being able to construct things 
via varargs construction, but not take them apart by varargs patterns, 
seems a gratuitous inconsistency.  (Again, maybe we decide that better 
type inference is worth doing first, but the lack of varargs will still 
be a wart.)


Deconstructors of a class also become a kind of war ground between 
OOP and pattern matching; OOP says that API is important, and 
pattern matching says it's OK to change the data, changing the API, 
because the compiler will point at the places where the code should be updated.
We still want encapsulation because it's a class, but we want to detect 
if its shape changes, so having a class with several shapes becomes not 
as useful as I first envisioned.


No, these are not in conflict at all.  The biggest tool OOP offers us is 
encapsulation; it gives us a way to decide how much state we want to 
expose, in what form, etc, fully decoupled from the representation.  
(Records don't have this option for decoupling representation from API, 
which is what makes it so easy to deliver these features first for 
records.)  Most classes still choose to give clients _some_ way to 
access most of the state we pass into the constructor and other API 
points; it's just that this part of the API is usually gratuitously 
different (e.g., accessors, wrapping with Optional) from the part where 
state goes in.  Which means that we *do* expose the state to readers, 
just in a gratuitously different way than we do to writers.  What 
pattern matching does is give us exactly the same control we have today 
over what to expose, and in what form, but lets us do it in a way that 
is structurally related to how we put state into objects.  It does so 
by combining multiple return, conditionality, and flow analysis in an 
integrated way, so we don't have to reinvent these in an ad-hoc way in 
every class.


So while we agree that records + sealed classes + pattern matching 
enable a nice form of data-oriented programming, and that was indeed a 
big goal, I think the model you're trying to extrapolate about what the 
"point" of pattern 

Re: It's the data, stupid !

2022-05-30 Thread Brian Goetz
Indeed, this is a big part of the motivation.  And it's not just pattern 
matching; it's the combination of records (representing data as data), 
sealed classes (the other half of algebraic data types, enabling richer 
data-as-data descriptions), and pattern matching (ad-hoc polymorphism, 
great for data).  The catchphrase we've been using in the last few years has 
been to make it easier to do "data-oriented programming" in Java.  This 
isn't a departure from OO, it's a recognition that not everything is 
best modeled as a stateful entity that communicates by sending and 
receiving messages.


Rolling back to the origin of this feature set (several years ago at 
this point), we observed that programs are getting smaller; monoliths 
give way to smaller services.  And the smaller the unit of code is, the 
closer it is to the boundary, at which it is exchanging messy untyped 
(or differently typed) data with the messy real world -- JSON, database 
result sets, etc.  (Gone are the days where it was Java objects all the 
way down, including across the wire boundary.)  We needed a simpler way 
to represent strongly typed ad-hoc data in Java, one that is easy to 
use, which can be easily mapped to and from the external messy formats 
at the boundary.  OO is great at defining and defending boundaries (it 
totally shines at platform libraries), but when it comes to modeling 
ordinary data, costs just as much but offers us less in return.  And 
pattern matching is key to being able to easily act on that data, take 
it apart, put it back together differently, etc.  The future 
installments of pattern matching are aimed at simplifying the code at 
that boundary; using pattern matching to mediate conversion from 
untyped, schema-free envelopes like JSON to 
illegal-states-are-unrepresentable data.


So, yes: records + sealed classes + pattern matching = embracing data as 
data.  I've got a piece I've been writing on this very topic, I'll send 
a link when it's up.


And yes, we've been talking a lot about the details, because that's what 
this group is for.  But I don't think we have lost sight of the big 
picture.


Is there something you think we've missed?


On 5/30/2022 8:33 AM, Remi Forax wrote:

Hi all,
i think the recent discussions about the pattern matching are too much about 
details of implementation and i fear we are losing the big picture, so let me 
explain why i (we ?) want to add pattern matching to Java.

Java's roots are OOP; encapsulation has served us well for the last 25+ years. It 
emphasizes API above everything else; data are not important because they're just a 
possible implementation of the API.

But OOP / encapsulation is really important for libraries, less for 
applications. For an application, data are more important, or at least as 
important as API.

The goal of pattern matching is to make data the center of the universe. Data are 
more important than code; if the data change because the business requirements 
change, the code should be updated accordingly. Pattern matching allows writing 
code depending on the data, and it will fail to compile if the data 
change, indicating every place in the code where the code needs to be updated 
to take care of the new data shape.

The data can change in different ways:
  1) a new kind of a type (a subtype of an interface) can be introduced; we 
have added sealed types and made switches on types exhaustive, so if a developer 
adds a new subtype of an interface, the compiler will refuse to compile all 
patterns that are not exhaustive anymore, indicating that the code must be 
updated.
  2) a data type can have a new field/component; we have introduced record patterns 
that match the exact shape of a record, so if a developer adds a new component, 
the compiler will refuse to compile a record pattern with the wrong shape, 
indicating that the code must be updated.

So experts, do you agree that this is what we want or did i miss something ?

Rémi

PS: the title is a nod to James Carville.





Refined type checking for GADTs (was: Pattern matching: next steps after JEP 405)

2022-05-24 Thread Brian Goetz




 - *Refined type checking for GADTs. *Given a hierarchy like:

    sealed interface Node<T> { }
    record IntNode(int i) implements Node<Integer> { }
    record FloatNode(float f) implements Node<Float> { }

we currently cannot type-check programs like:

    <T> Node<T> twice(Node<T> n) {
        return switch (n) {
            case IntNode(int x) -> new IntNode(x*2);
            case FloatNode(float x) -> new FloatNode(x*2);
        };
    }

because, while the match constrains the instantiation of T in each 
arm of the switch, the compiler doesn't know this yet. 


Much of this problem has already been explored by "Generalized Algebraic 
Data Types and Object Oriented Programming" (Kennedy and Russo, 2005); 
there's a subset of the formalism from that paper which I think can 
apply somewhat cleanly to Java.


The essence of the approach is that in certain scopes (which coincide 
exactly with the scope of pattern binding variables), additional _type 
variable equality constraints_ are injected.  For a switch like that 
above, we inject a T=Integer constraint into the first arm, and a 
T=Float into the second arm, and do our type checking with these 
additional constraints.  (The paper uses equational constraints only 
(T=Integer), but we may want additional upper bounds as well (T <: 
Comparable)).


The way it works in this example is: we gather the constraint Node<T> = 
Node<Integer> from the switch (by walking up the hierarchy and doing 
substitution), and unifying, which gives us the new equational 
constraint T=Integer.  We then type-check the RHS using the additional 
constraints.


The type checking adds some new rules to reflect equational constraints, 
FJ-style:


   \Gamma |- T=U   \Gamma |- C<T> OK
   --------------------------------- abstraction
   \Gamma |- C<T> = C<U>

   \Gamma |- C<T> = C<U>
   --------------------- reduction
   \Gamma |- T=U

   \Gamma |- X OK
   -------------- reflexivity
   \Gamma |- X=X

   \Gamma |- U=T
   ------------- symmetry
   \Gamma |- T=U

   \Gamma |- T=U  \Gamma |- U=V
   ---------------------------- transitivity
   \Gamma |- T=V

   \Gamma |- T=U
   ---------------- subtyping
   \Gamma |- T <: U

The key is that this only affects type checking; it doesn't rewrite any 
types.  Since in the first arm we are trying to assign an IntNode to a 
Node<T>, and IntNode <: Node<Integer>, by symmetry + subtyping, we get 
IntNode <: Node<T>, and yay it type-checks.


The main moving parts of this sub-feature are:

 - Defining scopes for additional constraints/bounds.  This can 
piggyback on the existing language of the form "if v is introduced when 
P is true, then v is definitely matched at X"; we can trivially extend 
this to say "a constraint is definitely matched at X".  This is almost 
purely mechanical.
 - Defining additional type-checking rules to support scope-specific 
constraints, along the lines above, in 4.10 (Subtyping).
 - In the description of type and records patterns (14.30.x), appeal to 
inference to gather equational constraints, and which patterns introduce 
an equational constraint.


This is obviously only a sketch; more details to follow.

Re: Pattern matching: next steps after JEP 405

2022-05-20 Thread Brian Goetz
You are right that varargs records are dancing on the edge of a cliff.  
But (a) we have varargs records, and (b) array/varargs patterns are not 
only for records.


If you're arguing that they are not essential *right now* and can be 
deferred, that's a reasonable argument, but you'd have to actually make 
that argument.


But it seems you are arguing that array and varargs patterns are 
*fundamentally incoherent.*  This argument seems way overblown, and as 
you've seen, overblown arguments are usually counterproductive.






On 5/20/2022 8:46 AM, Remi Forax wrote:





*From: *"Brian Goetz" 
*To: *"amber-spec-experts" 
*Sent: *Wednesday, May 18, 2022 9:18:01 PM
*Subject: *Pattern matching: next steps after JEP 405

JEP 405 has been proposed to target for 19.  But, it has some
loose ends that I'd like to refine before it eventually becomes a
final feature.  These include:

[...]




 -*Varargs patterns. * Records can be varargs, but we have an
asymmetry where we can use varargs in constructors but not in
deconstruction.  This should be rectified.  The semantics of this
is straightforward; given

    record Foo(int x, int y, int... zs) { }

just as

    new Foo(x, y, z1, z2)

is shorthand for

    new Foo(x, y, new int[] { z1, z2 })

we also can express

    case Foo(var x, var y, var z1, var z2)

as being shorthand for

    case Foo(var x, var y, int[] { var z1, var z2 })

This means that varargs drags in array patterns.



Thinking a bit about the varargs pattern, introducing them is not a 
good idea because a varargs record is not a safe construct by default:
- varargs are arrays, and arrays are mutable in Java, so varargs 
records are not immutable by default

- equals() and hashCode() do not work as-is either.

The record Foo should be written

  record Foo(int x, int y, int... zs) {
    Foo {
      zs = zs.clone();
    }

    public int[] zs() {
      return zs.clone();
    }

    public boolean equals(Object o) {
      return o instanceof Foo foo && x == foo.x && y == foo.y &&
             Arrays.equals(zs, foo.zs);
    }

    public int hashCode() {
      return Objects.hash(x, y, Arrays.hashCode(zs));
    }
  }

Given that most people will forget that the default behavior of a 
varargs record is not the right one, introducing a specific pattern 
for varargs record to mirror them is like giving a gentle nudge to 
somebody on a cliff.


Note that, it does not mean that we will not support varargs record, 
because

one can still write either
  case Foo(int x, int y, int[] zs)

or
  case Foo(int x, int y, int[] { int... zs })   // or a similar syntax 
that mix a record pattern and an array pattern


but just that there will be no streamlined syntax for a varargs record.

Rémi



Re: Pattern matching: next steps after JEP 405

2022-05-20 Thread Brian Goetz
I'm sorry, I have no idea what argument you are trying to make.  Start 
from the beginning.


On 5/20/2022 1:27 AM, fo...@univ-mlv.fr wrote:





*From: *"Brian Goetz" 
*To: *"Remi Forax" 
*Cc: *"amber-spec-experts" 
*Sent: *Thursday, May 19, 2022 3:05:07 PM
*Subject: *Re: Pattern matching: next steps after JEP 405



When you have a type pattern X in a middle of a pattern *and*
you have conversions, then there is an ambiguity,
does instanceof Box(X x) means
  Box(var v) && v instanceof X x
or
  Box(var v) && X x = (X) v;


This is not an ambiguity in the language, it is confusion on the
part of the reader :)

In any case, I'm not following your argument here. 



If you have both a type pattern and allow conversions, you have
  Box(X) is equivalent to Box(var v) && v instanceof Y y && X x = (X) y

How do you find Y ?

And yes, the bar is not only that Y has to be unique for the compiler, 
it has also to be obvious for the human reader too.


Rémi



Re: Collections patterns

2022-05-20 Thread Brian Goetz




Or maybe you mean something else; if so, please share!


The current proposal is more about matching and extracting the first 
arguments


It is really about matching *the whole array*.   Pattern matching is 
about destructuring.  Arrays are part of the language.  They have 
structure.  We give people a way to make arrays by specifying all the 
elements; pattern matching deconstructs the array by matching all the 
elements.


but matching/extracting the last arguments or the rest is also 
useful IMO.
For example, if I want to parse command line arguments composed of 
options and a filename, I may want to write something like


  case [String... options, String filename] -> ...
  case [String...] -> help()


Just because something is useful doesn't mean it has an equal claim to 
be a language feature.  (Arrays are a language feature; inserting into 
the middle of a sequence is useful, but arrays don't offer that -- we 
have lists for that.  That doesn't make arrays broken.  Lists are a 
different feature.)




Re: Pattern matching: next steps after JEP 405

2022-05-19 Thread Brian Goetz




When you have a type pattern X in a middle of a pattern *and* you have 
conversions, then there is an ambiguity,

does instanceof Box(X x) means
  Box(var v) && v instanceof X x
or
  Box(var v) && X x = (X) v;


This is not an ambiguity in the language, it is confusion on the part of 
the reader :)


In any case, I'm not following your argument here.



Re: Collections patterns

2022-05-19 Thread Brian Goetz


We may want to extract sub-parts of the array / collections by 
example, and i would prefer to have the same semantics and a similar 
syntax.


This is pretty vague, so I'll have to guess about what you might mean.

Maybe you mean: "I want to match a list if it contains a subsequence 
that matches this sequence of patterns", something like:


    [ ... p1, p2, ... ]

There is surely room to have APIs that query lists like this, but I 
think this is way out of scope for a pattern matching feature of the 
language.  Pattern matching is about _destructuring_. (Destructuring can 
be conditional.)   An array is a linear sequence of elements; it can be 
destructured by a linear sequence of patterns.


Maybe you mean: "I want to decompose a list into the head element and a 
tail list".


In Haskell, we iterate a list by recursion:

    len :: [a] -> Int
    len [] = 0
    len (x:xs) = 1 + len xs

But again, this is *mere destructuring*, because the cons operator (:) 
is the linguistic primitive for aggregation, and [ ... ] lists are just 
sugar over cons.  So matching to `x:xs` is again destructuring.  We 
could try to apply this to Java, but it gets very clunky (inefficient, 
no tail recursion, yada yada) because our lists are *built 
differently*.  Further further, arrays are another step removed from 
lists even.


Or maybe you mean something else; if so, please share!


Re: Pattern matching: next steps after JEP 405

2022-05-18 Thread Brian Goetz




Inference is also something we will need for pattern assignment

  Box<>(var s) = box;


Yes, it would work the same in all pattern contexts -- instanceof as 
well.  Every pattern context has a match target whose static type is known.






 - *Array patterns. * The semantics of array patterns are a pretty
simple extension to that of record patterns; the rules for
exhaustiveness, applicability, nesting, etc, are a relatively
light transformation of the corresponding rules for record
patterns.  The only new wrinkle is the ability to say "exactly N
elements" or "N or more elements". 



I wonder if we should not at least work a little on patterns on 
collections, just to be sure that the syntax and semantics of the 
patterns on collections and patterns on arrays are not too dissimilar.


This is a big new area; collection patterns would have to be co-designed 
with collection literals, and both almost surely depend on some sort of 
type class mechanism if we want to avoid the feature being lame.  I 
don't think it's realistic to wait this long, nor am I aiming at doing 
anything that looks like a generic array query mechanism.  Arrays have a 
first element, a second element, etc; the nesting semantics are very 
straightforward, and the only real question that needs additional 
support seems to be "match exactly N" or "match first N".






 - *Primitive patterns. *This is driven by another existing
asymmetry; we can use conversions (boxing, widening) when
constructing records, but not when deconstructing them.  There is
a straightforward (and in hindsight, obvious) interpretation for
primitive patterns that is derived entirely from existing cast
conversion rules. 



When calling a method / constructing an object, you can have several 
overloads, so you need conversions; those conversions are known at 
compile time.
(Note that Java has overloads mostly because there is a rift between 
primitives and objects; if there were a common supertype, I'm not sure 
overloads would have carried their own weight.)


When doing pattern matching, there are no overloads; otherwise you would 
have to decide which conversions to do at runtime.


There are no overloads YET, because the only deconstruction patterns are 
in records, and records have only one state description.  But that is a 
short-lived state of affairs.  When we do declared deconstruction 
patterns, we will need overload selection, and it will surely need to 
dualize the existing three-phase overload selection for constructors 
(e.g., loose, string, and varargs invocation contexts.)




Pattern matching: next steps after JEP 405

2022-05-18 Thread Brian Goetz
JEP 405 has been proposed to target for 19.  But, it has some loose ends 
that I'd like to refine before it eventually becomes a final feature.  
These include:


 - *Inference for record patterns. *Right now, we make you spell out 
the type parameters for a generic record pattern, such as:


    case Box<String>(String s):

We also don't allow parameterizations that are not consistent with the 
match target.  While this is clear, it is verbose (and gets worse when 
there is nesting), and also, because of the consistency restriction, the 
parameterization is entirely inferrable.  So we would like to allow the 
parameters to be inferred.  (Further, inference is relevant to GADTs, 
because we may be able to gather constraints on the pattern from the 
nested patterns, if they provide some type parameter specialization.)



 - *Refined type checking for GADTs. *Given a hierarchy like:

    sealed interface Node<T> { }
    record IntNode(int i) implements Node<Integer> { }
    record FloatNode(float f) implements Node<Float> { }

we currently cannot type-check programs like:

    <T> Node<T> twice(Node<T> n) {
        return switch (n) {
            case IntNode(int x) -> new IntNode(x*2);
            case FloatNode(float x) -> new FloatNode(x*2);
        };
    }

because, while the match constrains the instantiation of T in each arm 
of the switch, the compiler doesn't know this yet.



 -*Varargs patterns. * Records can be varargs, but we have an asymmetry 
where we can use varargs in constructors but not in deconstruction.  
This should be rectified.  The semantics of this is straightforward; given


    record Foo(int x, int y, int... zs) { }

just as

    new Foo(x, y, z1, z2)

is shorthand for

    new Foo(x, y, new int[] { z1, z2 })

we also can express

    case Foo(var x, var y, var z1, var z2)

as being shorthand for

    case Foo(var x, var y, int[] { var z1, var z2 })

This means that varargs drags in array patterns.


 - *Array patterns. * The semantics of array patterns are a pretty 
simple extension to that of record patterns; the rules for 
exhaustiveness, applicability, nesting, etc, are a relatively light 
transformation of the corresponding rules for record patterns.  The only 
new wrinkle is the ability to say "exactly N elements" or "N or more 
elements".



 - *Primitive patterns. * This is driven by another existing asymmetry; 
we can use conversions (boxing, widening) when constructing records, but 
not when deconstructing them.  There is a straightforward (and in 
hindsight, obvious) interpretation for primitive patterns that is 
derived entirely from existing cast conversion rules.
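
A small sketch of the intended flavor, assuming primitive patterns pick up 
the cast conversions (narrowing with a range check, unboxing, and so on) as 
described; the `Foo` record is just for illustration:

    record Foo(int x) { }

    static String describe(Foo f) {
        return switch (f) {
            case Foo(byte b) -> "fits in a byte: " + b;    // int -> byte narrowing, with a range check
            case Foo(int x)  -> "needs a full int: " + x;  // always matches the component
        };
    }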



Obviously there is more we will want to do, but this set feels like what 
we have to do to "complete" what we started in JEP 405.  I'll post 
detailed summaries, in separate threads, of each over the next few days.





Re: [External] : Re: Record pattern and side effects

2022-05-06 Thread Brian Goetz




The accessor throws an exception and with the semantics you propose it 
will happily be wrapped by as many MatchExceptions as possible.


But this class is already so deeply questionable, because accessors 
should not throw exceptions, that we've lost the game before we 
started.  Whether the exception comes wrapped or not is like asking "do 
you want whipped cream on your mud and axle-grease pie" :)


(Also, I don't see where the exception is wrapped multiple times? So I'm 
not even sure you are clear on what is being proposed.)





public sealed interface RecChain {
  default int size() {
    return switch (this) {
      case Nil __ -> 0;
      case Cons(var v, var s, var next) -> 1 + next.size();
    };
  }

  record Nil() implements RecChain { }

  record Cons(int value, int size, RecChain next) implements RecChain {
    @Override
    public int size() {
      return size != -1 ? size : RecChain.super.size();
    }
  }

  public static void main(String[] args) {
    RecChain chain = new Nil();
    for (var i = 0; i < 100; i++) {
      chain = new Cons(i, -1, chain);
    }
    System.out.println(chain.size());
  }
}


Rémi



Re: [External] : Re: Record pattern and side effects

2022-04-22 Thread Brian Goetz



Let's imagine that dtor D throws.  The wrapping happens when a
dtor/accessor is invoked _implicitly_ as a result of evaluating a
pattern match.  In both cases, we will wrap the thrown exception
and throw MatchException.  In this way, both instanceof and switch
are "clients of" pattern matching, and it is pattern matching that
throws.

I don't see any destruction here. 



I'm thinking about the refactoring from code using accessors to 
code using a deconstructor.

For example, IDEs may propose to refactor this code

  if (x instanceof D d) A(d.p()); else B;

to

  if (x instanceof D(P p)) A(p); else B;

or vice versa.
If you wrap deconstructor exceptions, but not accessor exceptions, you 
have a mismatch.


OK, sure.  This bothers me zero.  Having an accessor (or dtor) throw is 
already really^3 weird; having a program depend on which specific 
exception it throws is really^32 weird.  (In both cases, they still 
throw an exception that you probably shouldn't be catching, with a clear 
stack trace explaining where it went wrong.)  Not a case to design the 
language around.


Still not seeing any "destruction" here.



Re: [External] : Re: Record pattern and side effects

2022-04-21 Thread Brian Goetz



We’ve already asked one of the questions on side effects (though not sure we
agreed on the answer): what if the dtor throws?  The working story is that the
exception is wrapped in a MatchException.  (I know you don’t like this, but
let’s not rehash the same arguments.)

Wrapping exceptions into a MatchException destroys any idea of refactoring from 
a cascade of if ... instanceof to a switch.

I think refactoring is a use case we should support.


Wrapping exceptions thrown from dtors does not affect refactoring.

If I have:

    if (x instanceof D(P)) A;
    else if (x instanceof D(Q)) B;
    else C;

and I refactor to

    switch (x) {
    case D(P): A; break;
    case D(Q): B; break;
    default: C
    }

Let's imagine that dtor D throws.  The wrapping happens when a 
dtor/accessor is invoked _implicitly_ as a result of evaluating a 
pattern match.  In both cases, we will wrap the thrown exception and 
throw MatchException.  In this way, both instanceof and switch are 
"clients of" pattern matching, and it is pattern matching that throws.


I don't see any destruction here.


Re: [External] : Re: case null / null pattern (v2)

2022-04-21 Thread Brian Goetz


"case Foo fooButNull" is equivalent to "case null" but with a binding 
typed as Foo that's why i ask if it should even compile,

the compiler should ask for an explicit "case null".


It may be "equivalent" in our eyes, but the language doesn't currently 
incorporate nullity into the type system.  So it is similar to other 
kinds of "equivalences", where the human knows more than the language, 
such as single-sealed classes:


    sealed interface I permits A { }
    final class A implements I { }

    I i = ...
    A a = i

As humans, we know that the I must be an A, but we'd have to change the 
set of assignment (and other) conversions to reflect that.  We could, 
but right now, the language doesn't know this.


I could construct other examples of things the programmer knows but the 
language doesn't, but I think you get my point -- if you want to raise 
this equivalence into the language, this is not merely about pattern 
dominance, this is an upgrade to the type system.




I'm not sure it should be included in the dominance, because Foo(int) 
is a subtype of Foo, so

  case Foo foo ->
  case Foo(int _) foo ->
should not compile.


Dominance already handles this case.  A type pattern `X x` dominates a 
record pattern `X(...)`.




Re: case null / null pattern (v2)

2022-04-19 Thread Brian Goetz
With the currently specified semantics, the second pattern is dead, 
because switches will only match null at the top level with a case 
null.  This was an accommodation to clarify that the null-hostility 
of switch is a property of switch, not patterns, and to make it clearer 
when switch will NPE.


Regardless, what you're asking for is a more precise remainder 
checking.  The first pattern matches all non-null Foo; because no case 
matches null, you're asking that we recognize that there is a dominance 
relationship here.  This is reasonable to consider (though harder, 
because null makes everything harder.)




On 4/18/2022 6:49 PM, Remi Forax wrote:

I've found a way to encode the null pattern if you have a record

record Foo(int x) { }

Foo foo = ...
return switch(foo) {
   case Foo(int _) foo -> "i'm a foo not null here !";
   case Foo fooButNull -> "i can be only null here !";
};

I wonder if allowing those two patterns, a record pattern and a type pattern 
using the same type is a good idea or not, it seems a great way to obfuscate 
thing.

Rémi


Re: Record pattern and side effects

2022-04-17 Thread Brian Goetz
Yes, this is something we have to get “on the record”.  

Record patterns are a special case of deconstruction patterns; in general, we 
will invoke the deconstructor (which is some sort of imperative code) as part 
of the match, which may have side-effects or throw exceptions.  With records, 
we go right to the accessors, but it's the same game, so I’ll just say “invoke 
the deconstructor” to describe both.  

While we can do what we can to discourage side-effects in deconstructors, they 
will happen.  This raises all sorts of questions about what flexibility the 
compiler has.  

Q: if we have 

case Foo(Bar(String s)):
case Foo(Bar(Integer i)):

must we call the Foo and Bar deconstructors once, twice, or “dealer’s choice”?  
(I know you like the trick of factoring a common head, and this is a good 
trick, but it doesn’t answer the general question.)  

Q: To illustrate the limitations of the “common head” trick, if we have 

case Foo(P, Bar(String s)):
case Foo(Q, Bar(String s)):

can we factor a common “tail”, where we invoke Foo and Bar just once, and then 
use P and Q against the first binding?  

Q: What about reordering?  If we have disjoint patterns, can we reorder:

case Foo(Bar x): 
case TypeDisjointWithFoo t: 
case Foo(Baz x): 

into 

case Foo(Bar x): 
case Foo(Baz x): 
case TypeDisjointWithFoo t: 

and then fold the head, so we only invoke the Foo dtor once?

Most of the papers about efficient pattern dispatch are relatively little help 
on this front, because they come with the assumption of purity / 
side-effect-freedom.  But it seems obvious that if we were trying to optimize 
dispatch, our cost model would be something like arithmetic op << type test << 
dtor invocation, and so we’d want to optimize for minimizing dtor invocations 
where we can.  

We’ve already asked one of the questions on side effects (though not sure we 
agreed on the answer): what if the dtor throws?  The working story is that the 
exception is wrapped in a MatchException.  (I know you don’t like this, but 
let’s not rehash the same arguments.)  

But, exceptions are easier than general side effects because you can throw at 
most one exception before we bail out; what if your accessor increments some 
global state?  Do we specify a strict order of execution?  

You are appealing to a left-to-right constraint; this is a reasonable thing to 
consider, but surely not the only path.  But I think it's a lower-order bit; the 
higher-order bit is whether we are allowed to, or required to, fold multiple 
dtor invocations into one, and similarly whether we are allowed to reorder 
disjoint cases.  

One consistent rule is that we are not allowed to reorder or optimize anything, 
and do everything strictly left-to-right, top-to-bottom.  That would surely be 
a credible answer, and arguably the answer that builds on how the language 
works today.  But I don’t like it so much, because it means we give up a lot of 
optimization ability for something that should never happen.  (This relates to 
a more general question (again, not here) of whether a dtor / declared pattern 
is more like a method (as in Scala, returning Option[Tuple]) or “something 
else”.  The more like a method we tell people it is, the more pattern 
evaluation will feel like method invocation, and the more constrained we are to 
do things strictly top-to-bottom, left-to-right.)  

Alternately, we could let the language have freedom to “cache” the result of 
partial matches, where if we learned, in a previous case, that x matches 
Foo(STUFF), we can reuse that judgment.  And we can go further, to require the 
language to cache in some cases and not in others (I know this is your 
preferred answer.)  



> On Apr 17, 2022, at 5:48 AM, Remi Forax wrote:
> 
> This is something i think we have no discussed, with a record pattern, the 
> switch has to call the record accessors, and those can do side effects,
> revealing the order of the calls to the accessors.
> 
> So by example, with a code like this
> 
>   record Foo(Object o1, Object o2) {
>     public Object o2() {
>       throw new AssertionError();
>     }
>   }
> 
>   int m(Foo foo) {
>     return switch(foo) {
>       case Foo(String s, Object o2) -> 1;
>       case Foo foo -> 2;
>     };
>   }
> 
>  m(new Foo(3, 4));   // throw AssertionError ?
> 
> Do the call throw an AssertionError ?
> I believe the answer is no, because 3 is not a String, so Foo::o2() is not 
> called.
> 
> Rémi



Re: case null vs null pattern

2022-04-16 Thread Brian Goetz
This is correct; I agree this is “not quite where we want to be yet”, but the 
path to get there is not obvious, which is why we haven’t proposed anything 
more than we have.  

At some level (though this isn’t the whole story), the “null pattern” is in the 
same limbo as constant patterns.  Constant case labels are currently still just 
case labels, not patterns, so you can say `case 1` but cannot yet say `case 
Foo(1)`.  The semantics of constant patterns are trivial, but we’re still not 
sure what the best way to surface it is.  (This is mostly a syntax bike shed, 
but I don’t want to paint it now.)   So when constant patterns come to the top 
of the priority list, null will be waiting there along with “foo” and 42.  

Still, that doesn’t get us out of the woods; we can express “Foo or null” with 
a binding in switches, but not in instanceof.  (Truly, null is the gift that 
keeps on giving.)  Maybe the way out of that is nullable type patterns (`T?`), 
but we’re not ready to go there yet either.  

In the meantime, nulls in switch can avail themselves of the same workaround 
that other constant patterns do: `case Foo(var x) when x == null`.  
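
Spelled out, the workaround looks something like this (with a hypothetical 
`Foo` record):

    record Foo(Object x) { }

    Foo foo = ...;
    String s = switch (foo) {
        case Foo(var x) when x == null -> "a Foo holding null";
        case Foo(var x)                -> "a Foo holding " + x;
    };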

> On Apr 16, 2022, at 8:57 AM, Remi Forax wrote:
> 
> Hi all,
> I may be wrong, but it seems that the spec considers null as a kind of case 
> instead of as a kind of pattern, which disables the refactoring that should 
> be possible with the introduction of the record pattern.
> 
> Let's suppose I have a sealed type with only one implementation declared like 
> this
> 
>   sealed interface I {
>     record A() implements I { }
>   }
> 
> if null is part of the set of possible values, I have switches like this
> 
>   switch(i) {
>     case null, A a -> // do something with 'a' here
>   }
> 
> 
> Now we are introducing record patterns into the mix, so I can have a Box of I,
>   record Box(I i) { }
> 
> the problem is that I cannot write
>   switch(box) {
>     case Box(A|null a) -> // do something with 'a' here
>   }
> 
> because null is handled as a kind of case instead of as a kind of a null 
> pattern.
> 
> Should we introduce a null pattern instead of having a specific "case null" ?
> (and disallow the null pattern in an instanceof ?)
> 
> regards,
> Rémi



Re: Evolving past reference type patterns

2022-04-16 Thread Brian Goetz
> On Apr 15, 2022, at 10:10 PM, Guy Steele wrote:
> 
> That said, I am always (or at least now) a bit leery of language designers 
> motivating a new language feature by pointing out that it would make a 
> compiler easier to write. As I have learned the hard way on more than one 
> language project, compilers are not always representative of typical 
> application code. (Please consider this remark as only very minor pushback on 
> the form of the argument.)

Indeed, this is something to be vigilant for.  In fact, one could make this 
observation about pattern matching in entirety!  Pattern matching is a feature 
that all compiler writers love, because compilers are mostly just big 
tree-transforming machines, and so *of course* compiler writers see it as a way 
to make their lives easier.  (Obviously other programmers besides compiler 
writers like it too.)  

So, let me remind everyone why we are doing pattern matching, and it is not 
just “they’re better than visitors.”  We may be doing this incrementally, but 
there’s a Big Picture motivation here, let me try to tell some more of it.  

Due to trends in hardware and software development practices, programs are 
getting smaller.  It may be a cartoonish exaggeration to say that monoliths are 
being replaced by microservices, but the fact remains: units of deployment are 
getting smaller, because we’ve discovered that breaking things up into smaller 
units with fewer responsibilities offers us more flexibility.  Geometrically, 
when you shrink a region, the percentage of that region that is close to the 
boundary goes up.  

And so a natural consequence of this trend towards smaller deployment units is 
that more code is close to the boundary with the outside world, and will want 
to exchange data with the outside world.  In the Monolith days, “data from the 
outside world” was as likely as not to be a serialized Java object, but today, 
it is likely to be a blob of JSON or XML or YAML or a database result set.  And 
this data is at best untyped relative to Java’s type system.  (JSON has 
integers, but they’re not constrained to the range of Java’s int, etc.)  

At the boundary, we have to deal with all sorts of messy stuff: IO errors, bad 
data, etc.  But Java developers want to represent data using clean, statically 
typed objects with representational invariants.  In a Big Monolith, where most 
of the code is in the interior, it was slightly more tolerable to have big 
piles of conversion code at the boundary.  But when all the code lives a short 
hop from the boundary, our budget for adaptation to a more amenable format is 
smaller.  Records and sealed types let us define ad-hoc domain models; pattern 
matching lets us define polymorphism over those ad-hoc data models, as well as 
more general ad-hoc polymorphism.  Records, sealed types, and pattern matching 
let us adapt over the impedance mismatch between Java’s type system and messy 
stuff like JSON, at a cost we are all willing to pay.  

And it extends beyond simple patterns like we’ve seen so far; the end goal of 
this exercise is to use pattern matching to decompose complex entities like 
JSON blobs in a compositional manner — essentially defining the data boundary 
in one swoop, like an adapter that is JSON-shaped on one side and Java-shaped 
on the other. (We obviously have a long way to go to get there.)

Re: [External] : Re: Evolving past reference type patterns

2022-04-15 Thread Brian Goetz
Also, the following would be an error, even though the two are naturally 
dual:


record Foo(int x) { }

Foo f = new Foo(aShort);

if (f instanceof Foo(short x)) { ... }  // would be an error without 
`short x` applicable to int




On 4/15/2022 6:25 PM, Brian Goetz wrote:



​
Can you provide examples of such refactorings?



Refactoring

    int x = aShort;
    foo(x, x);

to

    let int x = aShort
    in foo(x, x);




Re: [External] : Re: Evolving past reference type patterns

2022-04-15 Thread Brian Goetz



​
Can you provide examples of such refactorings?



Refactoring

    int x = aShort;
    foo(x, x);

to

    let int x = aShort
    in foo(x, x);



Re: Evolving past reference type patterns

2022-04-15 Thread Brian Goetz

  * asking if something fits in the range of a byte or int; doing this
by hand is annoying and error-prone
  * asking if casting from long to int would produce truncation; doing
this by hand is annoying and error-prone


Here’s some real code I wrote recently that would benefit dramatically 
from this:


    default CodeBuilder constantInstruction(int value) {
        return with(switch (value) {
            case -1 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_M1);
            case 0 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_0);
            case 1 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_1);
            case 2 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_2);
            case 3 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_3);
            case 4 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_4);
            case 5 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_5);
            default -> {
                if (value >= -128 && value <= 127) {
                    yield ConstantInstruction.ofArgument(Opcode.BIPUSH, value);
                } else if (value >= -32768 && value <= 32767) {
                    yield ConstantInstruction.ofArgument(Opcode.SIPUSH, value);
                } else {
                    yield ConstantInstruction.ofLoad(Opcode.LDC,
                            BytecodeHelpers.constantEntry(constantPool(), value));
                }
            }
        });
    }


could become the less error-prone and uniform:

    default CodeBuilder constantInstruction(int value) {
        return with(switch (value) {
            case -1 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_M1);
            case 0 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_0);
            case 1 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_1);
            case 2 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_2);
            case 3 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_3);
            case 4 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_4);
            case 5 -> ConstantInstruction.ofIntrinsic(Opcode.ICONST_5);
            case byte value -> ConstantInstruction.ofArgument(Opcode.BIPUSH, value);
            case short value -> ConstantInstruction.ofArgument(Opcode.SIPUSH, value);
            default -> ConstantInstruction.ofLoad(Opcode.LDC,
                    BytecodeHelpers.constantEntry(constantPool(), value));
        });
    }


​

Evolving past reference type patterns

2022-04-15 Thread Brian Goetz
We characterize patterns by their *applicability* (static type 
checking), *unconditionality* (can matching be determined without a 
dynamic check, akin to the difference between a static and dynamic 
cast), and *behavior* (under what conditions does it match, and what 
bindings do we get).



   Currently shipping

As currently shipping, we have one kind of pattern: type patterns for 
reference types. We define the useful term “downcast convertible” to 
mean there is a cast conversion that is not unchecked. So `Object` and 
`ArrayList` are downcast-convertible to each other, as are `List` and 
`ArrayList`, as are `List<String>` and `ArrayList<String>`, but not 
`List<String>` to `ArrayList<Integer>`.


A type pattern `T t` for a ref type T is *applicable to* a ref type U if 
U is downcast-convertible to T.


A type pattern `T t` is *unconditional* on `U` if `U <: T`.

A type pattern `T t` matches a target x when the pattern is 
unconditional, or when `x instanceof T`; if so, its binding is `(T) x`.



   Record patterns

In the next round, we will add *record patterns*, which bring in *nested 
patterns*.


A record pattern `R(P*)` is applicable to a reference type U if U is 
downcast-convertible to R. A record pattern is never unconditional.


A record pattern `R(P*)` matches a target `x` when `x instanceof R`, and 
when each component of `R` matches the corresponding nested pattern 
`P_i`. Matching against components is performed using the *instantiated* 
static type of the component.
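
For example (sketch only; the `Pair` and `Wrapper` records are hypothetical), 
when the target's parameterization is known, nested patterns are checked 
against the instantiated component types:

    record Pair<T>(T first, T second) { }
    record Wrapper(Pair<String> p) { }

    Object o = ...;
    if (o instanceof Wrapper(Pair<String>(var a, var b))) {
        // the nested patterns are matched against the instantiated component
        // type of Pair<String>, so both a and b are typed as String here
    }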


Record patterns also drag in primitive patterns, because records can 
have primitive components.


A primitive type pattern `P p` is applicable to, and unconditional on, 
the type P. A primitive type pattern matches a target x when the pattern is 
unconditional, and its binding is `(P) x`.


Record patterns also drag in `var` patterns as nested patterns. A `var` 
pattern is applicable to, and unconditional on, every type U, and its 
binding when matched to `x` whose static type is `U`, is `x` (think: 
identity conversion.)


This is what we intend to specify for 19.


   Primitive patterns

Looking ahead, we’ve talked about how far to extend primitive patterns 
beyond exact matches. While I know that this makes some people 
uncomfortable, I am still convinced that there is a more powerful role 
for patterns to play here, and that is: as the cast precondition.


A language that has casts but no way to ask “would this cast succeed” is 
deficient; either casts will not be used, or we would have to tolerate 
cast failure, manifesting as either exceptions or data loss / 
corruption. (One could argue that for primitive casts, Java is deficient 
in this way now (you can make a lossy cast from long to int), but the 
monomorphic nature of primitive types mitigates this somewhat.) Prior to 
patterns, users have internalized that before a cast, you should first 
do an |instanceof| to the same type. For reference types, the 
|instanceof| operator is the “cast precondition” operator, with an 
additional (sensible) opinion that |null| is not deemed to be an 
instance of anything, because even if the cast were to succeed, the 
result would be unlikely to be usable as the target type.


There are many types that can be cast to `int`, at least under some 
conditions:


 * Integer, except null
 * byte, short, and char, unconditionally
 * Byte, Short, and Character, except null
 * long, but with potential loss of precision
 * Object or Number, if it’s not null and is an Integer

Just as `instanceof T` for a reference type T tells us whether a cast to 
T would profitably succeed, we can define `instanceof int` the same way: 
whether a cast to int would succeed without error or loss of precision. 
By this measure, `instanceof int` would be true for:


 * any int
 * Integer, when the instance is non-null (unboxing)
 * Any reference type that is cast-convertible to Integer, and is
   `instanceof Integer` (unboxing)
 * byte, short, and char, unconditionally (types that can be widened to
   int)
 * Byte, Short, and Character, when non-null (unboxing plus widening)
 * long when in the range of int (narrowing)
 * Long when non-null, and in the range of int (unboxing plus narrowing)

This table can be generated simply by looking at the set of cast 
conversions — and we haven’t talked about patterns yet. This is simply 
the generalization of `instanceof` to primitives. If we are to allow 
`instanceof int` at all, I don’t think there is really any choice of 
what it means. And this is useful in the language we have today, 
separate from patterns:


 * asking if something fits in the range of a byte or int; doing this
   by hand is annoying and error-prone
 * asking if casting from long to int would produce truncation; doing
   this by hand is annoying and error-prone

Doing this means that

    if (x instanceof T) ... (T) x ...

becomes universally meaningful, and captures exactly the preconditions 
for when the cast succeeds without error, loss of precision, or null 
escape. 
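
A sketch of what that buys, assuming `instanceof int` is allowed as described 
above (`consume` is just a placeholder):

    long aLong = ...;

    // Today: the range check and the cast must be kept in sync by hand.
    if (aLong >= Integer.MIN_VALUE && aLong <= Integer.MAX_VALUE) {
        consume((int) aLong);
    }

    // With primitive type patterns: matches exactly when the narrowing loses nothing.
    if (aLong instanceof int i) {
        consume(i);
    }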

Re: [External] : Re: Primitive type patterns

2022-04-07 Thread Brian Goetz





And, why would we not want duality with:

    record R(short s) { }
    ...
    new R(x) 



because new R(x) is alone while case R(...) is part of a larger set of 
patterns/sub-patterns in the pattern matching; if for each 
pattern/sub-pattern we need a double-entry table to understand the 
semantics, we are well past our complexity budget.




Again: it's the *same table* as all the other conversion contexts. To do 
something different here is what would be incremental complexity.


Let me try this one more time, from a third perspective.

What does `instanceof` do?  The only reasonable thing you would do after 
an `instanceof` is a cast.  Asking `instanceof` is asking, if I were to 
cast to this type, would I like the outcome?  Instanceof currently works 
only on reference types, and says no when (a) the cast would fail with 
CCE, and (b) if the operand is null, because, even though the cast would 
succeed, the next operation likely would not.


Instanceof may be defined only on references now, but casting primitives 
is excruciatingly defined with the same full double-entry table that you 
don't like.


So, when asking "what does a primitive type pattern mean", then, we can 
view this as the generalization of the same question: if I were to cast 
this target to this type, would I like the outcome?  The set of things 
that can go wrong (outcomes we wouldn't like) is slightly broader here; 
we can cast a long to int, and we'd get truncation, but that's a bad 
outcome, just like CCE.  If we interpret `instanceof` as the 
precondition test for a useful cast, we again get the same set of 
conversions:


 - I can safely cast a non-null Integer to int (unboxing with null check)
 - I can safely cast any int to Integer (boxing)
 - I can safely cast an int in the range of -128..127 to byte 
(narrowing with value range check)

 - I can safely cast any byte to int (widening)
 - etc.

It's a third different explanation, and yet it still comes up with 
*exactly the same set of conversions* as the other two explanations 
(assignment and method invocation).  There's a reason all roads lead 
here; because the conversion sets are defined in a consistent manner.  
Doing something *different* with pattern matching would be the new 
complexity.




Re: [External] : Re: Primitive type patterns

2022-04-07 Thread Brian Goetz


We already discussed those rules when we discuss instanceof,  it means 
that "x instanceof primitive" has different meaning depending on the 
type of x


No, it does not.  It means "does x match the pattern P" everywhere. It 
is pattern P is that has different meanings depending on type. This may 
sound like a silly distinction, but it is not!  Pattern matching is 
inherently polymorphic -- it is all about reflecting dynamic conversions 
statically -- and exhibits the *same* polymorphism regardless of where 
it occurs.



  Object x = ...
  if (x instanceof short) { ... }   // <=> instanceof Short + unboxing


Where x is a reference type, yes.  How is that not a reasonable question 
to ask?   And, why would we not want duality with:


    record R(short s) { }
    ...
    new R(x)

Asking `r instanceof R(short s)` is akin to asking "could this R have 
come from passing some x to the R constructor".   And that would work if 
x were byte, or x were a long literal in the value range of short, or a 
Short, or



It's also another creative way to have action at a distance,


More like *the same* way.

Anyway, your opposition to the entirety of this sub-feature is noted.

Re: Primitive type patterns

2022-04-07 Thread Brian Goetz
There's another, probably stronger, reason why primitive patterns 
supporting widening, narrowing, boxing, unboxing, etc, are basically a 
forced move, besides alignment with `let` statements, discussed earlier:


There is another constraint on primitive type patterns: the let/bind 
statement coming down the road.  Because a type pattern looks (not 
accidentally) like a local variable declaration, a let/bind we will 
want to align the semantics of "local variable declaration with 
initializer" and "let/bind with total type pattern". Concretely:


    let String s = "foo";

is a pattern match against the (total) pattern `String s`, which 
introduces `s` into the remainder of the block.  Since let/bind is a 
generalization of local variable declaration with initialization, 
let/bind should align with locals where the two can express the same 
thing.  This means that the set of conversions allowed in assignment 
context (JLS 5.2) should also be supported by type patterns.
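
To make that alignment concrete, a small sketch (the `let` statement itself 
is still hypothetical syntax):

    short aShort = 42;

    int x = aShort;        // compiles today: widening in assignment context (JLS 5.2)
    let int y = aShort;    // so the total pattern `int y` should also match here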


The other reason is the duality with constructors (and eventually, 
methods).  if we have a record:


 record R(int x, Long l) { }

we can construct an R with

    new R(int, Long)   // no adaptation
    new R(Integer, Long)   // unbox x
    new R(short, Long)  // widen x
    new R(int, long)   // box l

Deconstruction patterns are the dual of constructors; we should be able 
to deconstruct an R with:


    case R(int x, Long l)    // no adaptation
    case R(Integer x, Long l)    // box x
    case R(short s, Long l)   // range check
    case R(int x, long l)    // unbox l, null check

So the set of adaptations in method invocation context should align with 
those in nested patterns, too.




Re: Draft Spec for Third Preview of Pattern Matching for Switch and Record Patterns (JEP 405) now available

2022-04-07 Thread Brian Goetz



http://cr.openjdk.java.net/~gbierman/PatternSwitchPlusRecordPatterns/PatternSwitchPlusRecordPatterns-20220407/specs/patterns-switch-jls.html 



Comments welcome!


The execution of an exhaustive `switch` can fail with a linkage error 
(an `IncompatibleClassChangeError` is thrown) if it encounters an instance 
of a permitted direct subclass that was not known at compile time 
(14.11.3, 15.28.2). 
Strictly speaking, the linkage error is not flagging a binary 
incompatible change of the `sealed` class, but more accurately a 
*migration incompatible* change of the `sealed` class.


I think we should back away from ICCE here as well, and put this in the 
MatchException bucket too.  Then:


 - a switch throws NPE if the operand is null;
 - an _enum switch_ throws ICCE when encountering a novel constant;
 - all other remainder errors are MatchException.

File away for future use, that these clauses will have to be extended to 
include other exhaustive pattern-aware constructs, like let.



   14.11.1 Switch Blocks


The grammar for CaseOrDefaultLabel seems like it could be profitably 
refactored to reflect more of the restrictions:


    CaseOrDefaultLabel
    case (null | CaseConstant) {, CaseConstant }
    case [null, ] Pattern { WhenClause }
    case [null, ] default
    default

and then you don't have to enumerate as many of the restrictions of what 
can combine with what.


It is a compile-time error if a `when` expression has the value `false`.

... is a constant expression and has the value false ?

 * A pattern case element p is switch compatible with T if p is
   assignable to type T (14.30.3).


Isn't this cast-convertible?  If the selector is String and the pattern 
is `Object o`, o is not assignable to String, but it is cast-convertible.


A switch label is said to *dominate* another switch label

Can we say that in a pattern switch, default dominates everything, which 
has the effect of forcing the default to the bottom?


if there are values for which both apply and there is not an obvious 
preference


Is this really what we mean?  Don't we really mean that the first one 
matches everything the second one does?


A set of case elements is exhaustive

This is a nit, but couldn't this be its own subsection?  This section is 
getting long and varied.


T if T is downcast convertible to U

Is this right?  Upcast convertibility is OK too -- you can match `Object 
o` to a target of `String`, and vice versa.


If the type R is a raw type (4.8) 
then the type T must be a raw type, or vice versa; otherwise a 
compile-time error occurs.


Is this the right restriction?  What we want here (for this iteration) 
is that if R is generic, we specify the type parameters. But this is not 
the same thing.  I would think we would want to say here something like 
"if the class of R is a generic class, R cannot be raw".


whose type names R

missing a word

1.

   A switch label that supports a pattern p *applies* if the value
   matches p (14.30.2).
   If pattern matching completes abruptly then determining which switch
   label applies completes abruptly for the same reason.


I think this is carried over from the previous round?  Or do we not 
resolve total type patterns to any at the top level of a switch?


2.

   If no `case` label matches but there is a `default` label, then
   the `default` label *matches*.  If neither of these rules apply to any
   of the switch labels in the switch block, then a switch label that
   supports a `default` applies.


Don't we need a clause that says "if there is no default, a 
MatchException is thrown"?


*If pattern matching completes abruptly then the process of determining 
which switch label applies completes abruptly for the same reason.*


Doesn't it complete abruptly with MatchException?  Or can PM only 
complete abruptly with ME as well?


A type pattern that does not appear as an element in a record component 
pattern list is called a *top-level type pattern*.


For future: "or array component pattern list"

The pattern variable declared by an any pattern has a type, which is a 
reference type.


Is this still true?  What if I have `record R(int x) {}` and `case R(var 
x)`?  The type of x is not a reference type.  Same for `case R(int x)`.


A pattern p is said to 

Re: [External] : Re: Remainder in pattern matching

2022-04-01 Thread Brian Goetz


It seems pretty hard to land anywhere other than where you've landed, 
for most of this. I have the same sort of question as Dan: do we 
really want to wrap exceptions thrown by other patterns? You say we 
want to discourage patterns from throwing at all, and that's a lovely 
dream, but the behavior of total patterns is to throw when they meet 
something in their remainder.


Not exactly.  The behavior of *switch* is to throw when they meet 
something in the remainder of *all their patterns*.  For example:


    Box<Box<String>> bbs = new Box<>(null);
    switch (bbs) {
    case Box(Box(String s)): ...
    case null, Box b: ...
    }

has no remainder and will not throw.  Box(null) doesn't match the first 
pattern, because when we unroll to what amounts to


    if (x instanceof Box alpha && alpha != null
            && alpha.value() instanceof Box beta && beta != null) {
        s = beta.value(); ...
    }
    else if (x == null || x instanceof Box) { ... }

we never dereference something we don't know to be non-null.  So 
Box(null) doesn't match the first case, but the second case gets a shot 
at it.  Only if no case matches does switch throw; *pattern matching* 
should never throw.  (Same story with let, except it's like a switch with 
one putatively-exhaustive case.)


Since user-defined patterns will surely involve primitive patterns at 
some point, there is the possibility that one of those primitive 
patterns throws, which bubbles up as an exception thrown by a 
user-defined pattern.


Again, primitive patterns won't throw, they just won't match.  Under the 
rules I outlined last time, if I have:


    Box<Integer> b = new Box<>(null);
    switch (b) {
    case Box(int x): ...
    ...
    }

when we try to match Box(int x) to Box(null), it will not NPE, it will 
just not match, and we'll go on to the next case.  If all cases don't 
match, then the switch will throw ME, which is a failure of 
*exhaustiveness*, not a failure in *pattern matching*.


Does this change your first statement?



On Wed, Mar 30, 2022 at 7:40 AM Brian Goetz wrote:


We should have wrapped this up a while ago, so I apologize for the
late notice, but we really have to wrap up exceptions thrown from
pattern contexts (today, switch) when an exhaustive context
encounters a remainder.  I think there's really one one sane
choice, and the only thing to discuss is the spelling, but let's
go through it.

In the beginning, nulls were special in switch.  The first thing
is to evaluate the switch operand; if it is null, switch threw
NPE.  (I don't think this was motivated by any overt null
hostility, at least not at first; it came from unboxing, where we
said "if its a box, unbox it", and the unboxing throws NPE, and
the same treatment was later added to enums (though that came out
in the same version) and strings.)

We have since refined switch so that some switches accept null. 
But for those that don't, I see no other move besides "if the
operand is null and there is no null handling case, throw NPE." 
Null will always be a special remainder value (when it appears in
the remainder.)

In Java 12, when we did switch expressions, we had to confront the
issue of novel enum constants.  We considered a number of
alternatives, and came up with throwing ICCE.  This was a
reasonable choice, though as it turns out is not one that scales
as well as we had hoped it would at the time.  The choice here is
based on "the view of classfiles at compile time and run time has
shifted in an incompatible way."  ICCE is, as Kevin pointed out, a
reliable signal that your classpath is borked.

We now have two precedents from which to extrapolate, but as it
turns out, neither is really very good for the general remainder
case.

Recall that we have a definition of _exhaustiveness_, which is, at
some level, deliberately not exhaustive. We know that there are
edge cases for which it is counterproductive to insist that the
user explicitly cover, often for two reasons: one is that its
annoying to the user (writing cases for things they believe should
never happen), and the other that it undermines type checking (the
most common way to do this is a default clause, which can sweep
other errors under the rug.)

If we have an exhaustive set of patterns on a type, the set of
possible values for that type that are not covered by some pattern
in the set is called the _remainder_.  Computing the remainder
exactly is hard, but computing an upper bound on the remainder is
pretty easy.  I'll say "x may be in the remainder of P* on T" to
indicate that we're defining the upper bound.

 - If P* contains a deconstruction pattern P(Q*), null may be in
the remainder of P*.
 - If T is sealed, instances of a novel subtype of T may be in the
remainder of P*.

Re: [External] : Re: Pattern assignment

2022-04-01 Thread Brian Goetz


I'm certainly on board with a pattern-matching context that doesn't 
require a vacuous conditional. Remainder, as it often does to me, 
seems like the most likely point of confusion, but if we believe Java 
developers can get their heads around the idea of remainder in other 
contexts, I don't think this one is a novel problem.


Remainder is hard; the idea that our definition of "exhaustive" is 
intentionally defective is subtle, and will surely elicit "lol java" 
reactions among those disinclined to think very hard.  I wonder if a 
better term than "exhaustive" would help, one that doesn't promise so much.


I don't immediately see the benefit of partial patterns: why should I 
write


(I assume you mean you don't see the benefit of *let* with partial 
patterns, since if all patterns were total this would just be multiple 
return.)



let Optional.of(foo) = x;
else foo = defaultFoo;


Because of scoping, and because you can't have a pattern just write to a 
local, even a blank final.  (This could of course be made to work, but I 
would really rather avoid going there if we at all can. (Yes Remi, I 
know you're in favor of going there.))


when I could instead write (I assume blank finals are valid pattern 
variables?)


final Foo foo;
if (!(x instanceof Optional.of(foo))) foo = defaultFoo;


Not currently, and I'd like to avoid it.  One reason is that this looks 
too much like a factory invocation; another is that, if we ever have 
constant patterns, then it won't be clear whether `foo` above is a 
variable into which to write the answer, or a constant that is being 
matched to the result of the binding.  Both of these are fighting (with 
method invocation) for the concise syntax, and I'm not sure I want any 
of them to win, but they can't all win, and I am not ready to pick that 
winner yet.  But, we will probably have to confront this  in some form 
when we get to dtor declaration.


But yes, the main value of the `else` is so that bindings can be via a 
fallback path and be in scope for the rest of the method.  The rest of 
`else` and `when` is mostly along for the ride.  And its likely that we 
wouldn't do all these forms initially, but I wanted to sketch out the 
whole design space before doing anything.


Obviously it's shorter, but I'm not sure that's worth giving up the 
promised simplicity from earlier that `let` is for when "we know a 
pattern will always match".


OK, so you see this as being mostly "for unconditional patterns".

Let-expressions seem like a reasonable extension, though who knows how 
popular it will be. Of course, we could always generalize and add 
statement-expressions instead...alas, such a change will have to wait 
quite a while longer, I'm sure.


Let expressions would alleviate some but not all of the cases for which 
general statement-expressions would.  They are not quite as good for "f 
= new Foo(); f.setX(3); yield f;", but (IMO) better for pulling common 
subexpressions into variables whose scope is confined to the expression.
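
For instance, a common subexpression could be confined to just the expression 
that needs it (hypothetical let-expression syntax, not a settled form):

    double dist = let double dx = x2 - x1, dy = y2 - y1
                  in Math.sqrt(dx * dx + dy * dy);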


Did you consider allowing pattern parameters only in lambdas, not in 
methods in general? Since a lambda is generally "internal 
implementation" and a method is often API-defining, it might be 
reasonable to allow implementation details to leak into lambda 
definitions if it makes them more convenient to write, while keeping 
the more formal separation of implementation and API for method 
parameters.


Yes, but I didn't come up with a syntax I liked enough for both lambdas 
and let.  Perhaps I'll try some more.




On Fri, Mar 25, 2022 at 8:39 AM Brian Goetz wrote:


We still have a lot of work to do on the current round of pattern
matching (record patterns), but let's take a quick peek down the
road.  Pattern assignment is a sensible next building block, not
only because it is directly useful, but also because it will be
required for _declaring_ deconstruction patterns in classes
(that's how one pattern delegates to another.)  What follows is a
rambling sketch of all the things we _could_ do with pattern
assignment, though we need not do all of them initially, or even
ever.


# Pattern assignment

So far, we've got two contexts in the language that can
accommodate patterns --
`instanceof` and `switch`.  Both of these are conditional
contexts, designed for
dealing with partial patterns -- test whether a pattern matches,
and if so,
conditionally extract some state and act on it.

There are cases, though, when we know a pattern will always match,
in which case
we'd like to spare ourselves the ceremony of asking.  If we have a
3d `Point`,
asking if it is a `Point` is redundant and distracting:

```
Point p = ...
if (p instanceof Point(var x, var y, var z)) {
    // use x, y, z
}
```

In this situation, we're asking a question to which we know the answer.

Re: [External] : Re: Remainder in pattern matching

2022-03-31 Thread Brian Goetz

Here's some candidate spec text for MatchException:


Prototype spec for MatchException ( a preview API class ).

Thrown to indicate an unexpected failure in pattern matching.

MatchException may be thrown when an exhaustive pattern matching 
language construct (such as a switch expression) encounters a match 
target that does not match any of the provided patterns at runtime.  
This can arise from a number of cases:


 - Separate compilation anomalies, where a sealed interface has a 
different set of permitted subtypes at runtime than it had at 
compilation time, an enum has a different set of constants at runtime 
than it had at compilation time, or the type hierarchy has changed in 
incompatible ways between compile time and run time.
 - Null targets and sealed types.  If an interface or abstract class 
`C` is sealed to permit `A` and `B`, then the set of record patterns 
`R(A a)` and `R(B b)` are exhaustive on a record `R` whose sole 
component is of type `C`, but neither of these patterns will match `new 
R(null)`.
 - Null targets and nested record patterns.  Given a record type `R` 
whose sole component is `S`, which in turn is a record whose sole 
component is `String`, then the nested record pattern `R(S(String s))` 
will not match `new R(null)`.


Match failures arising from unexpected inputs will generally throw 
`MatchException` only after all patterns have been tried; even if 
`R(S(String s))` does not match `new R(null)`, a later pattern (such as 
`R r`) may still match the target.
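
A small sketch of the nested-record case and the "a later pattern may still 
match" behavior, using the `R` and `S` shapes described above:

    record S(String s) { }
    record R(S s) { }

    static String classify(R r) {
        return switch (r) {
            case R(S(String s)) -> "has a string: " + s;
            case R other        -> "R with a null component";  // new R(null) lands here
        };
    }
    // If the second case were removed, new R(null) would fall into the
    // remainder and the switch would throw MatchException.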


MatchException may also be thrown when operations performed as part of 
pattern matching throw an unexpected exception.  For example, pattern 
matching may cause methods such as record component accessors to be 
implicitly invoked in order to extract pattern bindings.  If these 
methods throw an exception, execution of the pattern matching construct 
may fail with `MatchException`.




On 3/30/2022 2:43 PM, Dan Heidinga wrote:

On Wed, Mar 30, 2022 at 2:38 PM Brian Goetz wrote:

Another way to think about this is:

  - If any of the code that the user actually wrote (the RHS of case clauses, 
or guards on case labels) throws, then the switch throws that
  - If any of the machinery of the switch dispatch throws, it throws 
MatchException.


That's a reasonable way to factor this and makes the difference
between the machinery and the direct user code clear, even when
looking at stacktraces.

And from your other response:


Another thing it gains is that it discourages people
from thinking they can use exceptions in dtors; having these laundered
through MatchException discourages using this as a side channel, though
that's a more minor thing.

This is a stronger argument than you give it credit for being.
Wrapping the exception adds a bit of friction to doing the wrong thing
which will pay off in helping guide users to the intended behaviour.

--Dan


On 3/30/2022 2:12 PM, Dan Heidinga wrote:

The rules regarding NPE, ICCE and MatchException look reasonable to me.


As a separate but not-separate exception problem, we have to deal with at least 
two additional sources of exceptions:

  - A dtor / record acessor may throw an arbitrary exception in the course of 
evaluating whether a case matches.

  - User code in the switch may throw an arbitrary exception.

For the latter, this has always been handled by having the switch terminate 
abruptly with the same exception, and we should continue to do this.

For the former, we surely do not want to swallow this exception (such an 
exception indicates a bug).  The choices here are to treat this the same way we 
do with user code, throwing it out of the switch, or to wrap with 
MatchException.

I prefer the latter -- wrapping with MatchException -- because the exception is thrown 
from synthetic code between the user code and the ultimate thrower, which means the 
pattern matching feature is mediating access to the thrower.  I think we should handle 
this as "if a pattern invoked from pattern matching completes abruptly by throwing 
X, pattern matching completes abruptly with MatchException", because the specific X 
is not a detail we want the user to bind to.  (We don't want them to bind to anything, 
but if they do, we want them to bind to the logical action, not the implementation 
details.)

My intuition (and maybe I have the wrong mental model?) is that the
pattern matching calling a user written dtor / record accessor is akin
to calling a method.  We don't wrap the exceptions thrown by methods
apart from some very narrow cases (ie: reflection), and I thought part
of reflection's behaviour was related to needing to ensure exceptions
(particularly checked ones) were converted to something explicitly
handled by the caller.

If the dtor / record accessor can declare they throw checked
exceptions, then I can kind of see the rationale for wrapping them.
Otherwise, it seems clearer to me to let them be thrown without
wrapping.

I don't think we expect users to 

Patterns and GADTs (and type checking and inference and overload selection)

2022-03-30 Thread Brian Goetz
GADTs -- sealed families whose permitted subtypes specialize the type 
variables of the base class -- pose some interesting challenges for 
pattern matching.


(Remi: this is a big, complex area.  Off-the-cuff "this is wrong" or 
"you should X instead" replies are not helpful.  If in doubt, ask 
questions.  One comprehensive reply is more useful than many small 
replies.  Probably best to think about the whole thing for some time 
before responding.)


Here is an example of a GADT hiearchy:

sealed interface Node<T> { }

record IntNode(int i) implements Node<Integer> { }
record BoolNode(boolean b) implements Node<Boolean> { }
record PlusNode(Node<Integer> a, Node<Integer> b) implements Node<Integer> { }

record OrNode(Node<Boolean> a, Node<Boolean> b) implements Node<Boolean> { }
record IfNode<T>(Node<Boolean> cond, Node<T> a, Node<T> b) implements Node<T> { }


Nodes can be parameterized, but some nodes are sharply typed, and some 
intermediate nodes (plus, or, if) have constraints on their components.  
This is enough to model expressions like:


    let
   a = true, b = false, x = 1, y = 2
   in
   if (a || b) then a + b else b;

Note that `if` nodes can work on both Node<Integer> and Node<Boolean>, 
and model a node of the right type.


## The Flow Problem

As mentioned earlier, pattern matching can recover constraints on type 
variables, but currently we do not act on these.  For example, we might 
want to write the eval() like this:


static <T> T eval(Node<T> n) {
    return switch (n) {
        case IntNode(var i) -> i;
        case BoolNode(var b) -> b;
        case PlusNode(var a, var b) -> eval(a) + eval(b);
        case OrNode(var a, var b) -> eval(a) || eval(b);
        case IfNode(var c, var a, var b) -> eval(c) ? eval(a) : eval(b);
    };
}

But this doesn't work.  The eval() method returns a T.  In the first 
case, we've matched Node<T> to IntNode, so the compiler knows `i : 
int`.  Humans know that T can only be Integer, but the compiler doesn't 
know that yet.  As a result, the choice to return `i` will cause a type 
error; the compiler has no reason to believe that `i` is a `T`.  The 
only choice the user has is an unchecked cast to `T`.  This isn't great.


We've discussed, as a possible solution, flow typing for type variables; 
matching IntNode to Node<T> can generate a constraint T=Integer in the 
scope where the pattern matches. Pattern matching is already an 
explicitly conditional construct; whether a pattern matches already 
flows into scoping and control flow analysis.  Refining type constraints 
on type variables is a reasonable thing to consider, and offers a 
greater type-safety payoff than ordinary flow typing (since most flow 
typing can be replaced with pattern matching.)


We have the same problem with the PlusNode and OrNode cases too; if we 
match PlusNode, then T can only be Integer, but the RHS will be int and 
assigning an int to a T will cause a problem.  Only the last case will 
type check without gathering extra T constraints.


## The Exhaustiveness Problem

Now suppose we have a Node<Integer>.  Then it can only be an IntNode, a 
PlusNode, or an IfNode.  So the following switch should be exhaustive:


static int eval(Node<Integer> n) {
    return switch (n) {
        case IntNode(var i) -> i;
        case PlusNode(var a, var b) -> eval(a) + eval(b);
        case IfNode(var c, var a, var b) -> eval(c) ? eval(a) : eval(b);
    };
}

We need to be able to eliminate BoolNode and OrNode from the list of 
types that have to be covered by the switch.


We're proposing changes in the current round (also covered in my 
Coverage doc) that refines the "you cover a sealed type if you cover all 
the permitted subtypes" rule to exclude those whose parameterization are 
impossible.


## The Typing Problem

Even without worrying about the RHS, we have problems with cases like this:

static <T> T eval(Node<T> n) {
    return switch (n) {
        ...
        case IfNode(var c, IntNode a, IntNode b) -> eval(c) ? a.i() : b.i(); // optimization
    };
}

We know that an IfNode must have the same node parameterization on both 
a and b.  We don't encourage raw IfNode here; there should be something 
in the <>.  The rule is that if a type / record pattern is 
generic, the parameterization must be statically consistent with the 
target type; there has to be a cast conversion without unchecked 
conversion.  (This can get refined if we get sufficiently useful 
constraints from somewhere else, but not relaxed.)  But without some 
inference, we can't yet conclude that Integer is a valid (i.e., won't 
require unchecked conversion) parameterization for Node.  But 
clearly, Integer is the only possibility here.  So we can't even write 
this -- we'd have to use a raw or wildcard case, which is not very 
good.  We need more inference here, so we have enough type information 
for better well-formedness checks.


## Putting it all together

Here's a related example from the "Lower your Guards" paper which ties 
it all together.  In Haskell:


data T a b where
  T1 :: T Int Bool
  T2 :: T Char Bool

g1 :: T Int b -> b -> Int
g1 T1 

Re: [External] : Re: Remainder in pattern matching

2022-03-30 Thread Brian Goetz
Yes, and this is a special case of a more general thing -- that while 
pattern declarations may have a lot in common with methods, they are not 
"just methods with multiple return" (e.g., they have a different set of 
characteristics at the declaration, they are intrinsically conditional, 
they are "invoked" differently.)  While their bodies may look 
method-like, and ultimately they boil down to methods, thinking "they 
are just methods" is likely to drag you to the wrong place.  Of course, 
it's a balance between how similar and HOW DIFFERENT they are, and that's 
what we're looking for.



Another thing it gains is that it discourages people
from thinking they can use exceptions in dtors; having these laundered
through MatchException discourages using this as a side channel, though
that's a more minor thing.

This is a stronger argument than you give it credit for being.
Wrapping the exception adds a bit of friction to doing the wrong thing
which will pay off in helping guide users to the intended behaviour.


Re: [External] : Re: Remainder in pattern matching

2022-03-30 Thread Brian Goetz

Another way to think about this is:

 - If any of the code that the user actually wrote (the RHS of case 
clauses, or guards on case labels) throws, then the switch throws that
 - If any of the machinery of the switch dispatch throws, it throws 
MatchException.
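
A small sketch of the two channels, under the proposal above (Wrapper and its deliberately misbehaving accessor are hypothetical):

    record Wrapper(Object payload) {
        public Object payload() {                     // accessor invoked by the record pattern
            throw new IllegalStateException("oops");  // failure in the matching machinery
        }
    }

    static String classify(Object o) {
        return switch (o) {
            case Wrapper(var p) -> "payload: " + p;   // accessor throws during matching,
                                                      // so the switch throws MatchException
            default -> "something else";              // an exception thrown in this arm is
                                                      // user code and propagates unchanged
        };
    }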


On 3/30/2022 2:12 PM, Dan Heidinga wrote:

The rules regarding NPE, ICCE and MatchException look reasonable to me.



As a separate but not-separate exception problem, we have to deal with at least 
two additional sources of exceptions:

  - A dtor / record accessor may throw an arbitrary exception in the course of 
evaluating whether a case matches.

  - User code in the switch may throw an arbitrary exception.

For the latter, this has always been handled by having the switch terminate 
abruptly with the same exception, and we should continue to do this.

For the former, we surely do not want to swallow this exception (such an 
exception indicates a bug).  The choices here are to treat this the same way we 
do with user code, throwing it out of the switch, or to wrap with 
MatchException.

I prefer the latter -- wrapping with MatchException -- because the exception is thrown 
from synthetic code between the user code and the ultimate thrower, which means the 
pattern matching feature is mediating access to the thrower.  I think we should handle 
this as "if a pattern invoked from pattern matching completes abruptly by throwing 
X, pattern matching completes abruptly with MatchException", because the specific X 
is not a detail we want the user to bind to.  (We don't want them to bind to anything, 
but if they do, we want them to bind to the logical action, not the implementation 
details.)

My intuition (and maybe I have the wrong mental model?) is that the
pattern matching calling a user written dtor / record accessor is akin
to calling a method.  We don't wrap the exceptions thrown by methods
apart from some very narrow cases (ie: reflection), and I thought part
of reflection's behaviour was related to needing to ensure exceptions
(particularly checked ones) were converted to something explicitly
handled by the caller.

If the dtor / record accessor can declare they throw checked
exceptions, then I can kind of see the rationale for wrapping them.
Otherwise, it seems clearer to me to let them be thrown without
wrapping.

I don't think we expect users to explicitly handle MatchException when
using pattern matching so what does wrapping gain us here?

--Dan



Re: [External] : Re: Remainder in pattern matching

2022-03-30 Thread Brian Goetz




It seems that what you are saying is that you think an Exception is 
better than an Error.


Not exactly; what I'm saying is that the attempt to separate stray nulls 
from separate compilation issues here seems like a heroic effort for low 
value, and I'd rather have one channel for "exhaustiveness failure" and 
let implementations decide how heroic they want to get in sorting out 
the possible causes.

Re: [External] : Re: Remainder in pattern matching

2022-03-30 Thread Brian Goetz
It's a little like calling a method, but a little not like it too. For 
example, when you match on a record pattern:


    case Point(var x, var y): ...

what may happen is *either* you will invoke a user-written deconstructor 
pattern, *or* we will test if you are a Point with `instanceof`, and 
then invoke the accessor methods (which might be user-written or 
implicit.)  Similarly, if you match:


    case Point(P, Q):
    case Point(R, S):

we may invoke the Point deconstructor once, or twice.  And there's no 
way to _directly_ invoke a pattern, only through switch, instanceof, and 
other contexts.


All of this means that invocations of pattern methods is more indirect, 
and mediated by the language, than invoking a method. When you invoke a 
method, you are assenting to its contract about what it returns, what it 
throws, etc.  When you match a pattern, it feels more like you are 
assenting to the contract of _pattern matching_, which in turn hides 
implementation details of what pattern methods are invoked, when they 
are invoked, how often, etc.


Dtors and record accessors cannot throw checked exceptions at all, and 
will be discouraged from throwing exceptions at all.


One thing wrapping gains is that it gives us a place to centralize 
"something failed in pattern matching", which includes exhaustiveness 
failures as well as failures of invariants which PM assumes (e.g., dtors 
don't throw.)   Another thing it gains is that it discourages people 
from thinking they can use exceptions in dtors; having these laundered 
through MatchException discourages using this as a side channel, though 
that's a more minor thing.


Agree we do not expect users to explicitly handle ME, any more so than NPE.


My intuition (and maybe I have the wrong mental model?) is that the
pattern matching calling a user written dtor / record accessor is akin
to calling a method.  We don't wrap the exceptions thrown by methods
apart from some very narrow cases (ie: reflection), and I thought part
of reflection's behaviour was related to needing to ensure exceptions
(particularly checked ones) were converted to something explicitly
handled by the caller.

If the dtor / record accessor can declare they throw checked
exceptions, then I can kind of see the rationale for wrapping them.
Otherwise, it seems clearer to me to let them be thrown without
wrapping.

I don't think we expect users to explicitly handle MatchException when
using pattern matching so what does wrapping gain us here?

--Dan





Re: [External] : Re: Remainder in pattern matching

2022-03-30 Thread Brian Goetz




For when the static world and the dynamic world disagree, I think your 
analysis has missed an important question: switching on an enum throws an 
ICCE very late, when we discover an unknown value, but in the case of a 
sealed type,


Actually, I thought about that quite a bit before proposing this. And my 
conclusion is: using ICCE was mostly a (well intentioned) mistake here, 
and "doubling down" on that path is more trouble than it is worth.  So 
we are minimally consistent with the ICCE choice in the cases that were 
compilable in 12, but for anything else, we follow the general rule.


The thought experiment that I did was: what if we had not done switch 
expressions in 12.  Then the only precedent we have to deal with is the 
null case, which has a pretty obvious answer.  So what would we do?  
Would we introduce 10s of catch-all cases solely for the purpose of 
diagnosing the source of remainder, or would we introduce a throwing 
default that throws MatchException on everything but null?  I concluded 
we would do the latter, so what is proposed here is basically that, but 
carving out the 12-compatibility case.


Remainders are dangling elses in a cascade of if ... else, so yes, we 
have to take care of them.


Yes, but we can take care of all of them in one fell swoop with a synthetic default.

So yes, it may take a lot of bytecode if we choose to add all branches, but 
the benefit is not questionable; it's far better than the alternative, 
which is GoodLuckFigureByYourselfException.


Yes, when you get a dynamic error here in a complex switch, the range of 
what could have gone wrong is large.  (The same will be true outside of 
switches when we have more kinds of patterns (list patterns, map 
patterns, etc) and more ways to compose patterns into bigger patterns; 
if we have a big complex pattern that matches the JSON document with the 
keys we want, if it doesn't match because (say) some integer nested nine 
levels deep overflowed 32 bits, this is also going to be hard to 
diagnose.)  But you are proposing a new and significant language 
requirement -- that the language should mandate an arbitrarily complex 
explanation of why something didn't match.  I won't dispute that this 
has benefit -- but I am not convinced this is necessarily the place for 
this, or whether the cost is justified by the benefit.


Also, note that the two are not inconsistent.  If the switch is required 
to throw MatchException on remainder, the compiler is *allowed* to try 
and diagnose the root cause (the ME can wrap something more specific), 
but not required to.   Pattern failure diagnosis then becomes a quality 
of implementation choice, rather than having complex, brittle rules 
mandated by the spec.  There's nothing to stop us from doing the 
equivalent of the "helpful NPE" JEP in the future.




Remainder in pattern matching

2022-03-30 Thread Brian Goetz
We should have wrapped this up a while ago, so I apologize for the late 
notice, but we really have to wrap up exceptions thrown from pattern 
contexts (today, switch) when an exhaustive context encounters a 
remainder.  I think there's really only one sane choice, and the only 
thing to discuss is the spelling, but let's go through it.


In the beginning, nulls were special in switch.  The first thing a switch 
did was evaluate the switch operand; if it was null, the switch threw NPE.  
(I don't think this was motivated by any overt null hostility, at least not 
at first; it came from unboxing, where we said "if it's a box, unbox it", 
and the unboxing throws NPE, and the same treatment was later added to 
enums (though that came out in the same version) and strings.)


We have since refined switch so that some switches accept null. But for 
those that don't, I see no other move besides "if the operand is null 
and there is no null handling case, throw NPE." Null will always be a 
special remainder value (when it appears in the remainder.)


In Java 12, when we did switch expressions, we had to confront the issue 
of novel enum constants.  We considered a number of alternatives, and 
came up with throwing ICCE.  This was a reasonable choice, though as it 
turns out is not one that scales as well as we had hoped it would at the 
time.  The choice here is based on "the view of classfiles at compile 
time and run time has shifted in an incompatible way."  ICCE is, as 
Kevin pointed out, a reliable signal that your classpath is borked.


We now have two precedents from which to extrapolate, but as it turns 
out, neither is really very good for the general remainder case.


Recall that we have a definition of _exhaustiveness_, which is, at some 
level, deliberately not exhaustive.  We know that there are edge cases 
for which it is counterproductive to insist that the user explicitly 
cover, often for two reasons: one is that its annoying to the user 
(writing cases for things they believe should never happen), and the 
other that it undermines type checking (the most common way to do this 
is a default clause, which can sweep other errors under the rug.)


If we have an exhaustive set of patterns on a type, the set of possible 
values for that type that are not covered by some pattern in the set is 
called the _remainder_.  Computing the remainder exactly is hard, but 
computing an upper bound on the remainder is pretty easy.  I'll say "x 
may be in the remainder of P* on T" to indicate that we're defining the 
upper bound.


 - If P* contains a deconstruction pattern P(Q*), null may be in the 
remainder of P*.
 - If T is sealed, instances of a novel subtype of T may be in the 
remainder of P*.
 - If T is an enum, novel enum constants of T may be in the remainder 
of P*.
 - If R(X x, Y y) is a record, and x is in the remainder of Q* on X, 
then `R(x, any)` may be in the remainder of { R(q) : q in Q*} on R.


Examples:

    sealed interface X permits X1, X2 { }
    record X1(String s) implements X { }
    record X2(String s) implements X { }

    record R(X x1, X x2) { }

    switch (r) {
        case R(X1(String s), any):
        case R(X2(String s), X1(String s)):
        case R(X2(String s), X2(String s)):
    }

This switch is exhaustive.  Let N be a novel subtype of X.  So the 
remainder includes:


    null, R(N, _), R(_, N), R(null, _), R(X2, null)

It might be tempting to argue (in fact, someone has) that we should try 
to pick a "root cause" (null or novel) and throw that.  But I think this 
is both excessive and unworkable.


Excessive: This means that the compiler would have to enumerate the 
remainder set (it's a set of patterns, so this is doable) and insert an 
extra synthetic clause for each.  This is a lot of code footprint and 
complexity for a questionable benefit, and the sort of place where bugs 
hide.


Unworkable: Ultimately such code will have to make an arbitrary choice, 
because R(N, null) and R(null, N) are in the remainder set.  So which is 
the root cause?  Null or novel?  We'd have to make an arbitrary choice.



So what I propose is the following simple answer instead:

 - If the switch target is null and no case handles null, throw NPE.  
(We know statically whether any case handles null, so this is easy and 
similar to what we do today.)
 - If the switch is an exhaustive enum switch, and no case handles the 
target, throw ICCE.  (Again, we know statically whether the switch is 
over an enum type.)
 - In any other case of an exhaustive switch for which no case handles 
the target, we throw a new exception type, java.lang.MatchException, 
with an error message indicating remainder.


The first two rules are basically dictated by compatibility.  In 
hindsight, we might have not chosen ICCE in 12, and gone with the 
general (third) rule instead, but that's water under the bridge.
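
To make the three rules concrete, here is a small sketch (Shape, Circle, and Square are made-up types; MatchException is the proposed java.lang.MatchException):

    sealed interface Shape permits Circle, Square { }
    record Circle(double radius) implements Shape { }
    record Square(double side) implements Shape { }

    static double area(Shape s) {
        return switch (s) {            // exhaustive: all permitted subtypes are covered
            case Circle c -> Math.PI * c.radius() * c.radius();
            case Square q -> q.side() * q.side();
        };
    }

    // area(null) throws NPE, since no case handles null (rule 1).
    // If Shape later permits a third subtype and only Shape is recompiled, a value
    // of that novel subtype falls into the remainder and the switch throws
    // MatchException (rule 3).  ICCE remains only for the legacy exhaustive enum
    // switch case (rule 2).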


We need to wrap this up in the next few days, so if you've concerns 
here, please get them on the record ASAP.



As 

Re: [External] : Re: Declared patterns -- translation and reflection

2022-03-30 Thread Brian Goetz


It's not by name.  I don't know where you got this idea. 



I think i understand the underlying semantics of the syntax, i'm not 
100% confident.


It's always OK to ask questions if you are not 100% sure!   In fact, it's 
generally better to do so.


The problem with the proposed syntax is that you invent a new kind of 
variable; until now, we had local variables and fields (and array 
cells, but those have no name).


It's valid to have concerns, but (a) please try to understand the entire 
design space before declaring it a "problem" (questions are OK), and (b) 
please wait until we're actually having that discussion.  This is a big, 
complex design space, and there is a clean separation between the user 
model of how it is declared and how it is rendered in the classfile, so 
I'm trying to keep the conversation focused so we can make progress.  
Please work with me on this.


Once we have pattern methods, we can have an interface that defines a 
pattern method and a class that implement it,

something like


As the "Patterns in the Object Model" document says, yes, patterns make 
sense in interfaces.  The best example is probably Map.Entry:


    for (Entry(var k, var v) : map.entrySet()) { ... }

These would translate (this conversation is about translation) as 
pattern methods in the interface.  (I haven't thought much about whether 
default implementations make sense.)


Do we agree that a binding type can be covariant? (Before saying no, 
think about generics; that's the reason we have return type covariance 
in Java.)

In that case, are we are in trouble with the translation strategy ?


It's a fair question about whether we want this.  When the bindings act 
as a "multiple return bundle", though, "covariant return" becomes much 
more complicated; you'd probably need some kind of "meet" restriction 
which says that for any two overrides X and Y that are more specific 
than Z, there is a "meet" W that is more specific than either X or Y.  
Not sure it is worth going there.


It's also a fair question about how it works out in translation; I'll 
think about this.


Pattern methods (static or not) do not have a real name, so '<' and 
'>' are here to signal that the name is in the Pattern attribute.
We do not want people to unmangle the name of pattern methods; that's why 
the name is in the attribute, and using '<' and '>' signals that idea.


Yes and no.  Remember that the non-dtor patterns in the source file have 
names, and they can be overloaded:


    class X {
    __pattern(bindings) p(args1) { ... }
    __pattern(bindings) p(args2) { ... }
    __pattern(bindings) q(args1) { ... }
    __pattern(bindings) q(args2) { ... }
 }

The mangled name must be unique for each of these, but the first two 
must be derived in part from "p" (so that when the file is recompiled, 
we come up with the same name).  Only dtors are "nameless" and need a 
standin name like  (or the class name, or any arbitrary spelling 
rule we want to make.)   So while the translated name not be exactly 
name$mangle, the name is important (it can just go into the mangled part 
if we like.)


As with other synthetic members, like bridge methods, we can make it a 
compile-time error to try to override it as a method rather than as a 
pattern.





Re: [External] : Re: Declared patterns -- translation and reflection

2022-03-29 Thread Brian Goetz




The mangling has to be stable across compilations with respect to
any source- and binary-compatible changes to the pattern
declaration.  One mangling that works quite well is to use the
"symbolic-freedom encoding" of the erasure of the pattern
descriptor.  Because the erasure of the descriptor is exactly as
stable as any other method signature derived from source
declarations, it will have the desired binary compatibility
properties, overriding will work as expected, etc. 



I think we need at least to use a special name like <deconstructor>, the 
same way we have <init>.


Yes.  Instance/static patterns will have names, so for them, we'll use 
the name as declared in the source.  Dtors have no names, just like 
ctors, so we have to invent something to stand in for that.  <deconstructor> or 
similar is fine.


I agree that we also need to encode the method type descriptor (the 
carrier type) into the name, so the name of the method in the 
classfile should be  or  (or 
perhaps  for the pattern methods).


The key constraint is that the mangled name be stable with respect to 
compatible changes in the declaration.  The rest is just "classfile 
syntax."





 Return value

In an earlier design, we used a pattern object (which was a bundle
of method handles) as the return value of the pattern.  This
enabled clients to invoke these via condy and bind method handles
into the constant pool for deconstruction and static patterns.

Either way, we make use of some sort of carrier object to carry
the bindings from the pattern to the client; either we return the
carrier from the pattern method, or there is a method on the
pattern object that we invoke to get a carrier.  We have a few
preferences about the carrier; we'd like to be able to late-bind
to the actual implementation (i.e., we don't want to freeze the
name of a carrier class in the method descriptor), and at least
for records, we'd like to let the record instance itself be the
carrier (since it is immutable and we can just invoke the
accessors to get the bindings.)


So the return type is either Object (to hide the type of the carrier) 
or a lambda that returns an Object (PatternObject or PatternCarrier 
acting like a glorified lambda).


If the pattern method actually runs the match, then I think Object is 
right.  If the method returns a constant bundle of method handles, then 
it can return something like PatternHandle or a matcher lambda.  But I 
am no longer seeing the benefit in this extra layer of indirection, 
given how the other translation work has played out.





    Pattern {
    u2 attr_name;
    u4 attr_length;
    u2 patternFlags; // bitmask
    u2 patternName;  // index of UTF8 constant
    u2 patternDescr; // index of MethodType (or alternately
UTF8) constant
    u2 attributes_count;
    attribute_info attributes[attributes_count];
    }

This says that "this method is a pattern", reifies the name of the
pattern (patternName), reifies the pattern descriptor
(patternDescr) which encodes the types of the bindings as a method
descriptor or MethodType, and has attributes which can carry
annotations, parameter metadata, and signature metadata for the
bindings.   The existing attributes (e.g. Signature,
ParameterNames, RVAA) can be reused as is, with the interpretation
that this is the signature (or names, or annos) of the *bindings*,
not the input parameters.  Flags can carry things like
"deconstructor pattern" or "partial pattern" as needed. 



From the classfile POV, a constructor is a method with a funny name in 
between brackets; I think deconstructors and pattern methods should 
work the same way.


Be careful of extrapolating from one data point.  Dtors are only one form 
of declared patterns; we also have to accommodate static and instance 
patterns.




Re: [External] : Re: Declared patterns -- translation and reflection

2022-03-29 Thread Brian Goetz




1/ conceptually there is a mismatch, the syntax introduce names for 
the bindings, but they have no names at that point, bindings only have 
names AFTER the pattern matching succeed.


I think you have missed the point here.  The names serve the 
implementation of the pattern, not the interface -- just as parameter 
names to methods do.   As you see in the example, these are effectively 
blank final locals in the body of the pattern, which must be assigned 
to.  (I'd have pointed this out if this were actually a message on 
declaring deconstructors, but since the message is on translation and 
reflection I didn't want to digress.)


2/ sending the value of the binding by name is alien to Java. In Java, 
sending values is by the position of the value.


It's not by name.  I don't know where you got this idea.

3/ the conceptual mismatch also exists at runtime, you need to permute 
the value of bindings before creating the carrier because a carrier 
takes the value of the binding by position while the code will takes 
the value of the bindings by name (you need the equivalent of 
MethodHandles.permuteArguments() otherwise you will see the 
re-organisation of the code if they are side effects).


It's not by name.  I don't know where you got this idea.



Re: [External] : Re: Declared patterns -- translation and reflection

2022-03-29 Thread Brian Goetz
I am disappointed that you took this as an invitation to digress into 
syntax here, when it should have been blindingly obvious that this was 
not the time for a syntax discussion.  (And when there is a syntax 
discussion, which this isn't, we need to cover all the different forms 
of declared patterns together; trying to design dtor patterns in a 
vacuum misses a number of considerations.)


I'll respond to your other points separately.

On 3/29/2022 6:19 PM, Remi Forax wrote:





*From: *"Brian Goetz" 
*To: *"amber-spec-experts" 
*Sent: *Tuesday, March 29, 2022 11:01:18 PM
*Subject: *Declared patterns -- translation and reflection

Time to take a peek ahead at _declared patterns_.  Declared
patterns come in three varieties -- deconstruction patterns,
static patterns, and instance patterns (corresponding to
constructors, static methods, and instance methods.)  I'm going to
start with deconstruction patterns, but the basic game is the same
for all three. 



I mostly agree with everything said apart from the syntax of a 
deconstructor

(see my next message about how small things can be improved).

I have several problems with the proposed syntax for a deconstructor.
 I can see the appeal of having code very similar to a constructor, 
but it's a trap: a constructor and a deconstructor do not have the 
same semantics.  A constructor initializes fields (which have names) 
while a deconstructor (or a pattern method) initializes bindings, which 
do not have names at that point yet.


1/ Conceptually there is a mismatch: the syntax introduces names for 
the bindings, but they have no names at that point; bindings only have 
names AFTER the pattern matching succeeds.
2/ Sending the value of the binding by name is alien to Java. In Java, 
sending values is by the position of the value.
3/ The conceptual mismatch also exists at runtime: you need to permute 
the values of the bindings before creating the carrier, because a carrier 
takes the value of the binding by position while the code will take 
the value of the bindings by name (you need the equivalent of 
MethodHandles.permuteArguments(); otherwise you will see the 
re-organisation of the code if there are side effects).


Let's try to come up with a syntax.
As I said, bindings have no names at that point, so the deconstructor 
should declare the bindings (int, int) and not (int x, int y),

so a syntax like

  _deconstructor_ (int, int) {
   _send_bindings_(this.x, this.y);
  }

Here the syntax shows that the values of the bindings are assigned 
following the position of the expressions, as usual in Java.



We can discuss if _send_bindings_ should be "return" or another 
keyword and if the binding types should be declared before or after 
_deconstructor_.


For example, if you want to maintain a kind of symmetry with the 
constructor, we can reuse the name of the class instead of 
_deconstructor_ and move the binding types in front of the name of the 
class to show that the bindings move from the class to the pattern 
matching in the same direction as the return type of a method.

Something like this:
  (int, int) Point {
   _send_bindings_(this.x, this.y);
  }

To summarize, the proposed syntax does not convey the underlying 
semantics of the binding initialization and makes things more 
confusing than they should be.




Ignoring the trivial details, a deconstruction pattern looks like
a "constructor in reverse":

```{.java}
class Point {
    int x, y;

    Point(int x, int y) {
    this.x = x;
    this.y = y;
    }

deconstructor(int x, int y) {
    x = this.x;
    y = this.y;
    }
}
```


Rémi



Declared patterns -- translation and reflection

2022-03-29 Thread Brian Goetz
Time to take a peek ahead at _declared patterns_.  Declared patterns 
come in three varieties -- deconstruction patterns, static patterns, and 
instance patterns (corresponding to constructors, static methods, and 
instance methods.)  I'm going to start with deconstruction patterns, but 
the basic game is the same for all three.


Ignoring the trivial details, a deconstruction pattern looks like a 
"constructor in reverse":


```{.java}
class Point {
    int x, y;

    Point(int x, int y) {
    this.x = x;
    this.y = y;
    }

    deconstructor(int x, int y) {
    x = this.x;
    y = this.y;
    }
}
```

Deconstruction patterns share the weird behaviors that constructors have 
in that they are instance members, but are not inherited, and that 
rather than having names, they are accessed via the class name.


Deconstruction patterns differ from static/instance patterns in that 
they are by definition total; they cannot fail to match. (This is a 
somewhat arbitrary simplification in the object model, but a reasonable 
one.)  They also cannot have any input parameters, other than the receiver.


Patterns differ from their ctor/method counterparts in that they have 
what appear to be _two_ argument lists; a parameter list (like ctors and 
methods), and a _binding_ list.  The parameter list is often empty (with 
the receiver as the match target). The binding list can be thought of as 
a "conditional multiple return".  That they may return multiple values 
(and, for partial patterns, can return no values at all when they don't 
match) presents a challenge for translation to classfiles, and for the 
reflection model.


 Translation to methods

Patterns contain imperative code, so surely we want to translate them to 
methods in some way.  The pattern input parameters map cleanly to method 
parameters.


The pattern bindings need to tunneled, somehow, through the method 
return (or some other mechanism).  For our deconstructor, we might 
translate as:


    PatternCarrier ()

(where the method applies the pattern, and PatternCarrier wraps and 
provides access to the bindings) or


    PatternObject ()

(where PatternObject provides indirection to behavior to invoke the 
pattern, which in turn returns the carrier.)


With either of these approaches, though, the pattern name is a problem, 
because patterns can be overloaded on their _bindings_, but both of 
these return types are insensitive to bindings.


It is useful to characterize the "shape" of a pattern with a MethodType, 
where the parameters of the MethodType are the binding types.  (The 
return type is less constrained, but it is sometimes useful to use the 
return type of the MethodType for the required type of the pattern.)  
Call this the "descriptor" of the pattern.


If we do this, we can use some name mangling to encode the descriptor in 
the method name:


    PatternCarrier name$mangle()

The mangling has to be stable across compilations with respect to any 
source- and binary-compatible changes to the pattern declaration.  One 
mangling that works quite well is to use the "symbolic-freedom encoding" 
of the erasure of the pattern descriptor.  Because the erasure of the 
descriptor is exactly as stable as any other method signature derived 
from source declarations, it will have the desired binary compatibility 
properties, overriding will work as expected, etc.


 Return value

In an earlier design, we used a pattern object (which was a bundle of 
method handles) as the return value of the pattern. This enabled clients 
to invoke these via condy and bind method handles into the constant pool 
for deconstruction and static patterns.


Either way, we make use of some sort of carrier object to carry the 
bindings from the pattern to the client; either we return the carrier 
from the pattern method, or there is a method on the pattern object that 
we invoke to get a carrier.  We have a few preferences about the 
carrier; we'd like to be able to late-bind to the actual implementation 
(i.e., we don't want to freeze the name of a carrier class in the method 
descriptor), and at least for records, we'd like to let the record 
instance itself be the carrier (since it is immutable and we can just 
invoke the accessors to get the bindings.)


 Carriers

As part of the work on template strings, Jim has put back some code that 
was originally written for the purpose of translating patterns, called 
"carriers".  There are methods / bootstraps that take a MethodType and 
return method handles to (a) encode values of those types into an opaque 
carrier object and (b) pull individual values out of a carrier.  This 
means that the choice of carrier object can be deferred to runtime, as 
long as both the bundling and unbundling methods handles agree on the 
carrier form.
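
As a rough sketch of how a client of these bootstraps might look (method names follow the java.lang.runtime.Carrier proposal discussed elsewhere in this thread; the API was still in flux, so treat this as illustrative):

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodType;
    import java.lang.runtime.Carrier;

    static void demo() throws Throwable {
        // Shape of a carrier with two int components.
        MethodType shape = MethodType.methodType(Object.class, int.class, int.class);

        MethodHandle ctor = Carrier.constructor(shape);   // (int, int) -> carrier
        MethodHandle c0   = Carrier.component(shape, 0);  // carrier -> first int
        MethodHandle c1   = Carrier.component(shape, 1);  // carrier -> second int

        Object carrier = ctor.invoke(1, 2);               // bundle the bindings...
        int x = (int) c0.invoke(carrier);                 // ...and unbundle them later
        int y = (int) c1.invoke(carrier);
    }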


The choice of carrier is largely a footprint/specificity tradeoff.  One 
could imagine a carrier class per shape, or a single carrier class that 
wraps an Object[], or 

Re: [External] : Re: Pattern assignment

2022-03-28 Thread Brian Goetz




There is another difference between assignment and _let_: a _let_ 
creates new fresh local variables (bindings) while assignment is able 
to reuse an existing local variable.


Correct, the more precise analogy is not to _assignment_, but to _local 
variable declaration with initialization_ (whose semantics are derived 
from assignment.)


In Java, the if statement is used a lot (too much IMO, but I don't 
think we should fight to change that), so it may make sense to be able 
to reuse existing local variables.


Yes, this has come up before.  I agree that there are cases where we 
might want this (there's one distinguished case where we almost cannot 
avoid this), but in general, I am pretty reluctant to go there -- I 
think this is incremental complexity (and encouragement of more 
mutability) with not enough commensurate benefit.






## Possible extensions

There are a number of ways we can extend `let` statements to make
it more
useful; these could be added at the same time, or at a later time.

 What about partial patterns?

There are times when it may be more convenient to use a `let` even
when we know
the pattern is partial.  In most cases, we'll still want to
complete abruptly if the
pattern doesn't match, but we may want to control what happens. 
For example:

```
let Optional.of(var contents) = optName
else throw new IllegalArgumentException("name is empty");
```

Having an `else` clause allows us to use a partial pattern, which
receives
control if the pattern does not match.  The `else` clause could
choose to throw,
but could also choose to `break` or `return` to an enclosing
context, or even
recover by assigning the bindings. 



I don't like that, because in that case "let pattern else ..." is 
equivalent to "if instanceof pattern else ...", with the former being 
expression oriented and the latter statement oriented.
As I said earlier, I don't think we should fight the fact that Java is 
statement oriented by adding expression-oriented variations of 
existing constructs.


We haven't talked about let expressions yet; this is still a statement.

It's a fair point to say that the above example could be rewritten as an 
if-else, and when the else throws unconditionally, we still get the same 
scoping.  Or that it can be rewritten as


    if (!(pattern match))
    throw blah

On the other hand, people don't particularly like having to invert the 
match like this just to get the scoping they want.


In any case, the real value of the else block is where you want to 
continue (and merge the control flow) with default values of the 
bindings set in the else clause (next section).  Dropping "else" makes 
this extremely messy.  And once you have else, the rest comes for the ride.





 What about recovery?

If we're supporting partial patterns, we might want to allow the
`else` clause
to provide defaults for the bindings, rather than throw.  We can
make the bindings of the
pattern in the `let` statement be in scope, but definitely
unassigned, in the
`else` clause, which means the `else` clause could initialize them
and continue:

```
let Optional.of(var contents) = optName
else contents = "Unnamed";
```

This allows us to continue, while preserving the invariant that
when the `let`
statement completes normally, all bindings are DA. 



It fails if the "then" part or the "else" part needs more than one 
statement.

Again, it's statement vs expression.


No, it's still a statement.  I don't know where you're getting this 
"statement vs expression" thing from?





 What about guards

If we're supporting partial patterns, we also need to consider the
case where
the pattern matches but we still want to reject the content.  This
could of
course be handled by testing and throwing after the `let`
completes, but if we
want to recover via the `else` clause, we might want to handle
this directly.
We've already introduced a means to do this for switch cases -- a
`when` clause
-- and this works equally well in `let`:

```
let Point(var x, var y) = aPoint
when x >= 0 && y >= 0
else { x = y = 0; }
```


It can be re-written using an if instanceof, so i do not think we need 
a special syntax


  int x, y;
  if (!(aPoint instanceof Point(_ASSIGN_ x, _ASSIGN_ y) && x >= 0 && y 
>= 0)) {

   x = 0;
   y = 0;
  }


All let statements can be rewritten as instanceof.  Are you arguing that 
the whole idea is silly?




"Let ... in" is useful but i don't think it's related to the current 
proposal, for me it's orthogonal. We can introduce "let ... in" 
independently to the pattern assignment idea,
and if the pattern assignment is already in the language, then "let 
... in" will support it.


Yes and no.  You are correct that we could do either or both 
independently.  But it's not my 

Pattern assignment

2022-03-25 Thread Brian Goetz
We still have a lot of work to do on the current round of pattern 
matching (record patterns), but let's take a quick peek down the road.  
Pattern assignment is a sensible next building block, not only because 
it is directly useful, but also because it will be required for 
_declaring_ deconstruction patterns in classes (that's how one pattern 
delegates to another.)  What follows is a rambling sketch of all the 
things we _could_ do with pattern assignment, though we need not do all 
of them initially, or even ever.



# Pattern assignment

So far, we've got two contexts in the language that can accommodate 
patterns --
`instanceof` and `switch`.  Both of these are conditional contexts, 
designed for

dealing with partial patterns -- test whether a pattern matches, and if so,
conditionally extract some state and act on it.

There are cases, though, when we know a pattern will always match, in 
which case
we'd like to spare ourselves the ceremony of asking.  If we have a 3d 
`Point`,

asking if it is a `Point` is redundant and distracting:

```
Point p = ...
if (p instanceof Point(var x, var y, var z)) {
    // use x, y, z
}
```

In this situation, we're asking a question to which we know the answer, and
we're distorting the structure of our code to do it.  Further, we're 
depriving
ourselves of the type checking the compiler would willingly do to 
validate that
the pattern is total.  Much better to have a way to _assert_ that the 
pattern

matches.

## Let-bind statements

In such a case, where we want to assert that the pattern matches, and 
forcibly

bind it, we'd rather say so directly.  We've experimented with a few ways to
express this, and the best approach seems to be some sort of `let` 
statement:


```
let Point(var x, var y, var z) p = ...;
// can use x, y, z, p
```

Other ways to surface this might be to call it `bind`:

```
bind Point(var x, var y, var z) p = ...;
```

or even use no keyword, and treat it as a generalization of assignment:

```
Point(var x, var y, var z) p = ...;
```

(Usual disclaimer: we discuss substance before syntax.)

A `let` statement takes a pattern and an expression, and we statically 
verify
that the pattern is exhaustive on the type of the expression; if it is 
not, this is a

type error at compile time.  Any bindings that appear in the pattern are
definitely assigned and in scope in the remainder of the block that 
encloses the

`let` statement.

Let statements are also useful in _declaring_ patterns; just as a subclass
constructor will delegate part of its job to a superclass constructor, a
subclass deconstruction pattern will likely want to delegate part of its 
job to
a superclass deconstruction pattern.  Let statements are a natural way 
to invoke

total patterns from other total patterns.
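
A sketch of that delegation, combining the provisional `deconstructor` syntax from the translation thread with a `let` statement (Point3D and its members are hypothetical, so this is illustrative only):

```
class Point3D extends Point {
    final int z;

    Point3D(int x, int y, int z) {
        super(x, y);
        this.z = z;
    }

    deconstructor(int x, int y, int z) {
        let Point(int px, int py) = this;   // delegate to the superclass pattern
        x = px;
        y = py;
        z = this.z;
    }
}
```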

 Remainder

Let statements require that the pattern be exhaustive on the type of the 
expression.
For total patterns like type patterns, this means that every value is 
matched,

including `null`:

```
let Object o = x;
```

Whatever the value of `x`, `o` will be assigned to `x` (even if `x` is null)
because `Object o` is total on `Object`.  Similarly, some patterns are 
clearly

not total on some types:

```
Object o = ...
let String s = o;  // compile error
```

Here, `String s` is not total on `Object`, so the `let` statement is not 
valid.

But as previously discussed, there is a middle ground -- patterns that are
_total with remainder_ -- which are "total enough" to be allowed to be 
considered

exhaustive, but which in fact do not match on certain "weird" values. An
example is the record pattern `Box(var x)`; it matches all box 
instances, even

those containing null, but does not match a `null` value itself (because to
deconstruct a `Box`, we effectively have to invoke an instance member on the
box, and we cannot invoke instance members on null receivers.) 
Similarly, the

pattern `Box(Bag(String s))` is total on `Box<Bag<String>>`, with remainder
`null` and `Box(null)`.

Because `let` statements guarantee that its bindings are definitely assigned
after the `let` statement completes normally, the natural thing to do when
presented with a remainder value is to complete abruptly by reason of 
exception.

(This is what `switch` does as well.)  So the following statement:

```
Box<Bag<String>> bbs = ...
let Box(Bag(String s)) = bbs;
```

would throw when encountering `null` or `Box(null)` (but not 
`Box(Bag(null))`, because that matches the pattern, with `s=null`), 
just like a switch containing only this case would.

 Conversions

JLS Chapter 5 ("Conversions and Contexts") outlines the conversions 
(widening,

narrowing, boxing, unboxing, etc) that are permitted in various contexts
(assignment, loose method invocation, strict method invocation, cast, etc.)
We need to define the set of conversions we're willing to perform in the 
context

of a `let` statement as well; which of the following do we want to support?

```
let int x = aShort; // primitive widening
let byte b = 0;   

Re: [External] : Re: Pattern coverage

2022-03-24 Thread Brian Goetz
Right, in this model "default" clauses map to "any" patterns.  It 
doesn't (yet) deal with remainder, but that will come in a separate 
section.  This is all about static type checking.  Also, the last two 
rules probably leave out some of the generics support, but that's not 
essential to the model; we're mostly trying to make sure we understand 
what exhaustiveness is, in a way that it can be specified.




On 3/24/2022 1:56 PM, Remi Forax wrote:

Thanks for sharing,
in the text, there are several mentions of the default pattern but the 
default pattern is not defined.


Rémi

----

*From: *"Brian Goetz" 
*To: *"amber-spec-experts" 
*Sent: *Thursday, March 24, 2022 6:39:21 PM
*Subject: *Pattern coverage

I've put a document at

http://cr.openjdk.java.net/~briangoetz/eg-attachments/Coverage.pdf

which outlines a formal model for pattern coverage, including
record patterns and the effects of sealing. This refines the work
we did earlier.  The document may be a bit rough so please let me
know if you spot any errors.  The approach here should be more
amenable to specification than the previous approach.





Pattern coverage

2022-03-24 Thread Brian Goetz

I've put a document at

http://cr.openjdk.java.net/~briangoetz/eg-attachments/Coverage.pdf

which outlines a formal model for pattern coverage, including record 
patterns and the effects of sealing.  This refines the work we did 
earlier.  The document may be a bit rough so please let me know if you 
spot any errors.  The approach here should be more amenable to 
specification than the previous approach.




Re: [External] : Re: Record pattern, the runtime side

2022-03-17 Thread Brian Goetz




On 3/16/2022 4:34 PM, fo...@univ-mlv.fr wrote:

- Original Message -

From: "Brian Goetz" 
To: "Remi Forax" , "amber-spec-experts" 

Sent: Wednesday, March 16, 2022 5:41:49 PM
Subject: Re: Record pattern, the runtime side

It works in 3 steps:
Step 1, at compile time, the compiler takes all the patterns and creates a tree
of patterns from the list of patterns;
patterns that start with the same prefix are merged together.

We can "normalize" a complex pattern into a sequence of simpler
conditionals.  For example, matching the record pattern

     case Circle(Point(var x, var y), var r)

can be unrolled (and type inference applied) as

     x matches Circle(Point(var x, var y), var r)
     === x matches Circle(Point p, int r) && p matches Point(int x, int y)

Deconstruction patterns are known to have only an `instanceof`
precondition; the deconstructor body won't ever fail (unlike more
general static or instance patterns like Optional::of.)

If you define "matches" in terms of instanceof, this transformation does not work 
in the context of an assignment,
because you want
   Point(var x, var y) = null
to throw an NPE.


Yes, but this is not what matches means.  Matches is the three-place 
predicate that takes the static type of the target into account.  It 
differs from instanceof only at null, which is why I wrote "matches" 
rather than "instanceof".  You'll see in later rounds of lowering how 
this gets turned back into instanceof, taking into account the static 
type information.



But it's a very valid transformation if the pattern is not total and "matches" 
means instanceof in the context of a switch or instanceof and requireNonNull + cast in 
the context of an assignment.


Correct:

 P(Q) === P(var a) && a matches Q  always, whereas
 P(Q) === P(var a) && a instanceof Q  **when Q is not unconditional 
on the target of P**


Updated terminology scorecard: a pattern P is *unconditional* on a type 
T if it matches all values of T; in other words, if it is not asking a 
question at all.  The only unconditional patterns are "any" patterns 
(_), "var" patterns, and total type patterns. Deconstruction patterns 
are never unconditional, because they don't match on nulls.


On the other hand, a pattern P is *exhaustive* on a type T if it is 
considered "good enough" for purposes of static type checking. 
Deconstruction patterns D(...) are exhaustive on types T <: D, even 
though they don't match null.  The difference is *remainder*.
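
A small illustration of the distinction (Box is a hypothetical record):

    record Box(String s) { }

    // For a target of static type Box:
    //  - `var b` and the total type pattern `Box b` are *unconditional*:
    //    they match every value, including null.
    //  - `Box(var s)` is *exhaustive* on Box (good enough for static checking),
    //    but not unconditional: it does not match null, so null is remainder.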



Also from the runtime POV, a deconstructor and a pattern method (static or 
instance) are identical, if we follow John's idea to use null for "does not 
match". Obviously, it does not preclude us from differentiating between the two at 
the language level.


With one difference; the language makes deconstructors always total 
(they can't fail to match if the target is of the right type), whereas 
pattern methods can fail to match.  So in the translations where I write 
"c.deconstruct(...)" we are assuming that the deconstructor is always 
"true".



In the end, the tree of patterns is encoded in the bytecode as a tree of
constant dynamic (each Pattern is created only from constant and patterns).

With record patterns, we don't even need pattern descriptions, because
we can translate it all down to instanceof tests and invoking record
component accessors.  Of course, that ends when we have deconstruction
patterns, which correspond to imperative code; then having a Pattern
instantiation, and a way to get to its matching / binding-extraction
MHs, is needed.

Yes, record patterns are the last patterns we can implement in terms of a cascade 
of ifs. I use that fact in the prototype; the runtime uses switch + type 
patterns because it does not use invokedynamic.


We can use a cascade of if either way, but record patterns are more 
"transparent" in that the compiler can lower the match criteria and 
extraction of bindings to primitives it already understands, whereas a 
method pattern is an opaque blob of code.



For the future, I'm not sure we will want to use invokedynamic for all 
patterns; indy is still quite slow until C2 kicks in.


It is a difficult tradeoff to decide what code to emit for 
narrowly-branching trees.  The indy approach means we can change plans 
later, but has a startup cost that may well not buy us very much; having 
a heuristic like "use if-else chains for fewer than five types" is brittle.





Re: Record pattern, the runtime side

2022-03-16 Thread Brian Goetz





It works in 3 steps:
Step 1, at compile time, the compiler takes all the patterns and creates a tree 
of patterns from the list of patterns;
patterns that start with the same prefix are merged together.


We can "normalize" a complex pattern into a sequence of simpler 
conditionals.  For example, matching the record pattern


    case Circle(Point(var x, var y), var r)

can be unrolled (and type inference applied) as

    x matches Circle(Point(var x, var y), var r)
    === x matches Circle(Point p, int r) && p matches Point(int x, int y)

Deconstruction patterns are known to have only an `instanceof` 
precondition; the deconstructor body won't ever fail (unlike more 
general static or instance patterns like Optional::of.)  So we can 
further rewrite this as:


    x matches Circle(Point(var x, var y), var r)
    === x matches Circle(Point p, int r) && p matches Point(int x, int y)
    === x instanceof Circle c && c.deconstruct(Point p, int r) && p 
instanceof Point && p.deconstruct(int x, int y)


(where the "deconstruct" term invokes the deconstructor and binds the 
relevant values.)


If we are disciplined about the order in which we unroll (e.g., always 
depth-first and always left-to-right), with a modest amount of 
normalization, your "same pattern prefix" turns into the simpler "common 
prefix of normalized operations".  Record deconstructors can be further 
normalized, because the can be replaced with calling the accessors:


    x matches Circle(Point(var x, var y), var r)
    === x matches Circle(Point p, int r) && p matches Point(int x, int y)
    === x instanceof Circle c && (Point p = c.center()) && (int r = 
c.radius()) && p instanceof Point

&& (int x = p.x()) && (int y = p.y())

Of course, this is all very implementation-centric; factoring matching 
this way is somewhat unusual, since the arity and order of side-effects 
might be surprising to Java developers.  (Yes, having side-effects in 
pattern definitions is bad, but it may still happen.)  So the spec will 
have to do some fast dancing to allow this.
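
For concreteness, a hand-written sketch of that fully lowered form (Circle and Point are assumed to be records with the obvious components; the actual translation need not look exactly like this):

    record Point(int x, int y) { }
    record Circle(Point center, int radius) { }

    static boolean matchesCircleOfPoint(Object o) {
        if (o instanceof Circle c) {
            Point p = c.center();
            int r = c.radius();
            if (p instanceof Point) {   // a record pattern does not match a null component
                int x = p.x();
                int y = p.y();
                // bindings x, y, r are available here
                return true;
            }
        }
        return false;
    }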



In the end, the tree of patterns is encoded in the bytecode as a tree of 
constant dynamic (each Pattern is created only from constant and patterns).


With record patterns, we don't even need pattern descriptions, because 
we can translate it all down to instanceof tests and invoking record 
component accessors.  Of course, that ends when we have deconstruction 
patterns, which correspond to imperative code; then having a Pattern 
instantiation, and a way to get to its matching / binding-extraction 
MHs, is needed.





Re: Record pattern: matching an empty record

2022-03-13 Thread Brian Goetz
Given a record R, and a record pattern R(P*), where P* is a list of 
nested patterns of the same arity as R's components, then


    x matches R(P*)

iff

    x instanceof R
    && R(var alpha*)  // always true, just binds
    && \forall i alpha_i matches P_i

If P* is empty, the last clause is vacuously true.

On 3/13/2022 11:09 AM, Remi Forax wrote:

Hi all,
while writing the prototype of the runtime,
I found a case I think we never discussed: can we match an empty record?

record Empty() { }

switch(object) {
   case Empty() -> ...  // no binding here

I think the answer is yes because i don't see why we should do a special case 
for that, but i may be wrong.

Rémi


Re: Proposal: java.lang.runtime.Carrier

2022-03-09 Thread Brian Goetz
And in the future, when we have templated classes, some carriers may well 
become specializations of arity-indexed base classes (CarrierTuple1, 
CarrierTuple2, etc), where the VM takes responsibility for nasty things 
like when to unload specializations.

On Mar 9, 2022, at 12:23 PM, John Rose <john.r.r...@oracle.com> wrote:


ClassSpecializer is designed for cases beyond generating tuples, where some 
extra behavioral contract, and/or fixed field set, is required across all the 
generated classes.

That said, ClassSpecializer should support tuple generation nicely, for Carrier.

Maurizio’s point is a good one, although if I were Jim I’d hesitate to use 
something complicated to generate classes for just this one simple case. OTOH, 
our sense of what is “simple” sometimes needs adjustment. In the end, the class 
file generation might be simple, but the infrastructure of generating and 
registering classes (and allowing them to be unloaded in some cases) is rather 
subtle, and maintainers will thank us for centralizing it.

So, Jim, please do take a look at ClassSpecializer. It’s there for use cases 
like this one, even if in the end we don’t select it in this use case.

On 3 Mar 2022, at 10:49, Maurizio Cimadamore wrote:

Seems sensible.

As a possible "test", we could perhaps use this mechanism in the JDK 
implementation of LambdaForms? We do have places where we spin "species" 
classes:

https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/invoke/ClassSpecializer.java

(that said, maybe species classes contain a bit more than just data, so perhaps 
that's a wrong fit - but anyway, worth talking a look for possible code 
duplication).

Maurizio


On 03/03/2022 13:57, Jim Laskey wrote:

We propose to provide a runtime anonymous carrier class object generator; 
java.lang.runtime.Carrier. This generator class is designed to share anonymous 
classes when shapes are similar. For example, if several clients require 
objects containing two integer fields, then Carrier will ensure that each 
client generates carrier objects using the same underlying anonymous class.

Providing this mechanism decouples the strategy for carrier class generation 
from the client facility. One could implement one class per shape; one class 
for all shapes (with an Object[]), or something in the middle; having this 
decision behind a bootstrap means that it can be evolved at runtime, and 
optimized differently for different situations.

Motivation

The String Templates JEP 
draft proposes the 
introduction of a TemplatedString object for the primary purpose of carrying 
the template and associated values derived from a template literal. To avoid 
value boxing, early prototypes described these carrier objects using 
per-callsite anonymous classes shaped by value types. The use of distinct 
anonymous classes here is overkill, especially considering that many of these 
classes are similar; containing one or two object fields and/or one or two 
integral fields. Pattern matching has a similar issue when carrying the values 
for the holes of a pattern. With potentially hundreds (thousands?) of template 
literals or patterns per application, we need to find an alternate approach for 
these value carriers.

Description

In general terms, the Carrier class simply caches anonymous classes keyed on 
shape. To further increase similarity in shape, the ordering of value types is 
handled by the API and not in the underlying anonymous class. If one client 
requires an object with one object value and one integer value and a second 
client requires an object with one integer value and one object value, then 
both clients will use the same underlying anonymous class. Further, types are 
folded as either integer (byte, short, int, boolean, char, float), long (long, 
double) or object. [We've seen that the performance hit from folding the long group 
into the integer group is significant, hence the separate group.]

The Carrier API uses MethodType parameter types to describe the shape of a 
carrier. This fits the primary use case, where bootstrap methods 
need to capture indy non-static arguments. The API has three static methods:

// Return a constructor MethodHandle for a carrier with components
// aligning with the parameter types of the supplied methodType.
static MethodHandle constructor(MethodType methodType)

// Return a component getter MethodHandle for component i.
static MethodHandle component(MethodType methodType, int i)

// Return component getter MethodHandles for all the carrier's components.
static MethodHandle[] components(MethodType methodType)

Examples

import java.lang.runtime.Carrier;
...

// Define the carrier description.
MethodType methodType =
    MethodType.methodType(Object.class, byte.class, short.class,
                          char.class, int.class, long.class,
                          float.class, double.class,
                          boolean.class, String.class);

// 
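
A minimal sketch of tying the three methods above together, assuming the 
signatures as listed (the values below are placeholders, purely for 
illustration):

    // Fetch the carrier constructor:
    // (byte,short,char,int,long,float,double,boolean,String) -> Object
    MethodHandle constructor = Carrier.constructor(methodType);

    // Create a carrier object (placeholder values).
    Object object = (Object) constructor.invokeExact(
        (byte) 0x7F, (short) 0x7FFF, 'C', 42, 42L,
        1.0f / 3.0f, 1.0 / 3.0, true, "abcde");

    // Get the component accessors and read a value back out.
    MethodHandle[] components = Carrier.components(methodType);
    byte b = (byte) components[0].invoke(object);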

Re: [External] : Re: Proposal: java.lang.runtime.Carrier

2022-03-09 Thread Brian Goetz
Also, I wonder if the external Carrier API should have a way to wrap an 
existing record class to see it as a Carrier, so the destructuring pattern will 
behave the same way with a record or with the result of a de-constructor.

Having records be their own carrier is an optimization we anticipate wanting to 
make, but I don’t think it is needed for Carrier to be involved in the 
deception. We’ll need a higher-level representation of a pattern (e.g., 
PatternHandle), which will expose MHs for “try to match, return a carrier if 
it matches” and MHs for deconstructing the carrier.  A PH could use the record 
itself as the carrier, and the record’s accessor MHs as the component 
accessors, and not use Carrier at all, and the PH client can’t tell the 
difference.

So Carrier here is intended to be the lowest level of the stack; a building 
block for aggregating tuples, nothing more.  We can then build pattern matching 
atop that (which can use Carrier or not, as it sees fit) and switch dispatch 
atop that.
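
For what it's worth, the record-as-its-own-carrier idea can be sketched with 
nothing but today's method-handle API; the record's accessors stand in for the 
component accessors (the names below are hypothetical, just for illustration):

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    public class RecordAsCarrier {
        // The record instance itself is the "carrier"; its accessors are the components.
        record PointCarrier(int x, int y) { }

        public static void main(String[] args) throws Throwable {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle xAcc = lookup.findVirtual(PointCarrier.class, "x",
                    MethodType.methodType(int.class));
            MethodHandle yAcc = lookup.findVirtual(PointCarrier.class, "y",
                    MethodType.methodType(int.class));

            Object carrier = new PointCarrier(3, 4);   // handed around as an opaque Object
            System.out.println((int) xAcc.invoke(carrier) + ", " + (int) yAcc.invoke(carrier));
        }
    }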





Re: [External] : Re: Proposal: java.lang.runtime.Carrier

2022-03-09 Thread Brian Goetz
>> 
>> The minimal constraint is that the return type of the constructor MH is the 
>> same type as the argument type of the component MHs.
> 
> Agreed.  The types should match but they shouldn't be considered part
> of the api.  I don't think (correct me if I'm wrong) that we want them
> to "escape" and be baked into classfiles.  The implementation of the
> anonymous class holding the fields ("holder object") should remain as
> a hidden implementation detail.  One way to do that is to enforce that
> the holder object is always hidden behind other public types like
> Object.

Yes.  So my question is, who does the laundry?  Is it the carrier API (who 
always says Object), or the caller who is going to take the return value of the 
carrier constructor and stick it in an Object?  Does it make a difference?  If 
I take the constructor MH, and compose it with the component MHs, will having 
an extraneous Object signature make it harder to expose the true type (which 
may be a Q type), or will that routinely and reliably come out in the wash 
anyway?  

>> It would seem to me that preserving stronger types here dynamically gives MH 
>> combinators more room to optimize?
> 
> Only the outer edge of the MH chain would need to return (constructor)
> / take (component) Object.  The implementation of the MHs can use a
> sharper type.  I don't think we gain any optimization abilities here
> by exposing the sharper type - at worst there's an asType operation to
> check the type but that shouldn't be visible in the performance
> profile.

OK, so you’re saying it’s fine to slap an Object label on it, as it will come 
off easily when needed.
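
A small sketch of that "coming out in the wash", using the proposed Carrier 
methods and assuming the component handle accepts the carrier as Object:

    MethodType mt = MethodType.methodType(Object.class, int.class, String.class);
    MethodHandle ctor   = Carrier.constructor(mt);    // (int,String) -> Object
    MethodHandle second = Carrier.component(mt, 1);   // assumed shape: (Object) -> String

    // The Object produced by the constructor is consumed directly by the accessor,
    // so no client ever names the hidden carrier type.
    MethodHandle both = MethodHandles.filterReturnValue(ctor, second);  // (int,String) -> String
    String s = (String) both.invoke(42, "hi");        // "hi"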



Re: [External] : Re: Proposal: java.lang.runtime.Carrier

2022-03-09 Thread Brian Goetz

What I was proposing is for switch to cram "no match" and the index of the 
matching case into one int, because using -1 seems natural and it will work well 
with tableswitch.

There’s two levels here, and I think part of the confusion with regard to 
pattern translation is we’re talking at different levels.

The first level is: I’ve written a deconstructor for class Foo, and I want to 
match it with instanceof, case, whatever.  I need a way to “invoke” the pattern 
and let it conditionally “return” multiple values.  Carrier is a good tool for 
this job.

The second level is: I want to use indy to choose which branch of a switch to 
take, *and* at the same time, carry all the values needed to that branch.  
Carrier could be applied to this as well.

Somewhere in between, there’s the question of how we roll up the values in a 
compound pattern (e.g., Circle(Point(var x, var y) p, var r) c).  This could 
involve flattening all the bindings (x, y, p, r, c) into a fresh carrier, or it 
could involve a “carrier of carriers”.  There are many degrees of freedom in 
the translation story.
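
To make the two rollup shapes concrete, a hedged illustration (every record 
name below is hypothetical, purely for exposition):

    record Point(int x, int y) { }
    record Circle(Point center, double r) { }

    // (1) flattening: one fresh carrier holding every binding of
    //     Circle(Point(var x, var y) p, var r) c
    record FlatBindings(int x, int y, Point p, double r, Circle c) { }

    // (2) carrier of carriers: the outer carrier's first component is itself the
    //     carrier produced by matching the nested Point pattern
    record XY(int x, int y) { }
    record NestedBindings(XY xy, Point p, double r, Circle c) { }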

What Jim is proposing here is a runtime for bootstraps to make decomposable 
tuples that can be pass across boundaries that agree on a contract.  This could 
be a simple return-to-caller, or it could rely on sharing the carrier in the 
heap between entities that have a shared static typing proof.

To come back to the carrier API, does it mean that the carrier class is always 
a nullable value type, or does it mean that we need a knob to select between a 
primitive type and a value type?

Probably the carrier can always be a *primitive* class type, and the null can 
be handled separately by boxing from QCarrier$33 to LCarrier$33.  All the 
Carrier API does is provide a constructor which takes N values and returns a 
carrier; at that point, you already know you want a non-null value.  Consumers 
higher up the food chain can opt into nullity.




Re: Proposal: java.lang.runtime.Carrier

2022-03-08 Thread Brian Goetz
The minimal constraint is that the return type of the constructor MH is the 
same type as the argument type of the component MHs.  It would seem to me that 
preserving stronger types here dynamically gives MH combinators more room to 
optimize?

> On Mar 8, 2022, at 4:25 PM, Dan Heidinga  wrote:
> 
> Hi Jim,
> 
> Will Carrier::constructor(MethodType) require that the MT's return
> type is Object.class and thus return a MH that returns an Object?  Or
> can other Classes / Interfaces be used as the return type?  Likewise,
> will Carrier::component(MethodType, int) only accept Object as the
> input argument?  The return type of the ::constructor generated MH
> will need to be the same as the input arg of the ::component
> accessors.
> 
> And do you see this api primarily being used in invokedynamic
> bootstrap methods?  I'm wondering how easy it will be to statically
> determine the set of MethodHandles required by uses of this API.
> Primarily for when we need to implement this in qbicc but also for
> other native image style projects.
> 
> --Dan
> 
> On Thu, Mar 3, 2022 at 8:58 AM Jim Laskey  wrote:
>> 
>> We propose to provide a runtime anonymous carrier class object generator; 
>> java.lang.runtime.Carrier. This generator class is designed to share 
>> anonymous classes when shapes are similar. For example, if several clients 
>> require objects containing two integer fields, then Carrier will ensure that 
>> each client generates carrier objects using the same underlying anonymous 
>> class.
>> 
>> Providing this mechanism decouples the strategy for carrier class generation 
>> from the client facility. One could implement one class per shape; one class 
>> for all shapes (with an Object[]), or something in the middle; having this 
>> decision behind a bootstrap means that it can be evolved at runtime, and 
>> optimized differently for different situations.
>> 
>> Motivation
>> 
>> The String Templates JEP draft proposes the introduction of a 
>> TemplatedString object for the primary purpose of carrying the template and 
>> associated values derived from a template literal. To avoid value boxing, 
>> early prototypes described these carrier objects using per-callsite anonymous 
>> classes shaped by value types. The use of distinct anonymous classes here is 
>> overkill, especially considering that many of these classes are similar; 
>> containing one or two object fields and/or one or two integral fields. 
>> Pattern matching has a similar issue when carrying the values for the holes 
>> of a pattern. With potentially hundreds (thousands?) of template literals or 
>> patterns per application, we need to find an alternate approach for these 
>> value carriers.
>> 
>> Description
>> 
>> In general terms, the Carrier class simply caches anonymous classes keyed on 
>> shape. To further increase similarity in shape, the ordering of value types 
>> is handled by the API and not in the underlying anonymous class. If one 
>> client requires an object with one object value and one integer value and a 
>> second client requires an object with one integer value and one object 
>> value, then both clients will use the same underlying anonymous class. 
>> Further, types are folded as either integer (byte, short, int, boolean, 
>> char, float), long (long, double) or object. [We've seen that the performance 
>> hit from folding the long group into the integer group is significant, hence 
>> the separate group.]
>> 
>> The Carrier API uses MethodType parameter types to describe the shape of a 
>> carrier. This incorporates with the primary use case where bootstrap methods 
>> need to capture indy non-static arguments. The API has three static methods;
>> 
>> // Return a constructor MethodHandle for a carrier with components
>> // aligning with the parameter types of the supplied methodType.
>> static MethodHandle constructor(MethodType methodType)
>> 
>> // Return a component getter MethodHandle for component i.
>> static MethodHandle component(MethodType methodType, int i)
>> 
>> // Return component getter MethodHandles for all the carrier's components.
>> static MethodHandle[] components(MethodType methodType)
>> 
>> Examples
>> 
>> import java.lang.runtime.Carrier;
>> ...
>> 
>> // Define the carrier description.
>> MethodType methodType =
>>MethodType.methodType(Object.class, byte.class, short.class,
>>char.class, int.class, long.class,
>>float.class, double.class,
>>boolean.class, String.class);
>> 
>> // Fetch the carrier constructor.
>> MethodHandle constructor = Carrier.constructor(methodType);
>> 
>> // Create a carrier object.
>> Object object = (Object)constructor.invokeExact((byte)0xFF, (short)0x,
>>'C', 0x, 0xL,
>>1.0f / 3.0f, 1.0 / 3.0,
>>true, "abcde");
>> 
>> // Get an array of accessors for the carrier object.
>> MethodHandle[] components = Carrier.components(methodType);
>> 
>> // Access fields.
>> byte b = 

Re: [External] : Re: Proposal: java.lang.runtime.Carrier

2022-03-07 Thread Brian Goetz



Adding more information,
we want the carrier to be a primitive type (to be able to optimize it 
away), which means that we cannot use null to represent "do_not_match"; 
we have to have a flag inside the carrier for that.


The alternate approach is to use a .ref class for partial patterns 
(using null for "no match") and a B3 class for total patterns (since it 
needs no failure channel.)


I think its pretty important that the static name of the carrier class 
not appear in generated bytecode.  As a result, we will have to use a 
reference type (object or interface), which means we get the null 
channel "for free".




Re: Telling the totality story

2022-03-05 Thread Brian Goetz




So, I think the main thing we can control about the story is the 
terminology.  I think part of what people find confusing is the use of 
the term "total", since that's a math-y term, and also it collides 
with "exhaustive", which is similar but not entirely coincident.


One concept we might want to appeal to here is the notion of a "pattern 
that isn't really asking a question".  This is still going to depend on 
the static type of the target, but rather than appealing to a 
mathematical term like "total" or "statically total", perhaps something 
like "a vacuous pattern" (much like the statement "all hairy hairless 
apes are green" is vacuously true), or a "self evident" pattern, or "an 
unconditional pattern" or an "effectively any" pattern or a "constant 
pattern" [1], or an "undiscerning pattern", or ...


Recall that the total / effectively any / unconditional / vacuous 
patterns are:


 - `var x`
 - an any pattern `_` , if we ever have it
 - A type pattern `T t`, when applied to a target `U <: T`.

All other patterns at least ask some question.
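
For example, with the preview pattern-switch syntax of the moment (just a 
sketch):

    Object o = "hello";
    switch (o) {
        case String s   -> System.out.println("asks a question: is o a String?");
        case Object obj -> System.out.println("asks nothing: unconditional for an Object target");
    }
    // `var x` is likewise unconditional, and `CharSequence cs` is unconditional
    // when the target's static type is String, since String <: CharSequence.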

(FWIW, what the spec currently does is: syntactic patterns undergo a 
resolution to a runtime pattern as part of type analysis.  So given a 
target type of Object, the pattern `Object o` *resolves* to an any 
pattern, whereas `String s` resolves to the type pattern.  May or may 
not be helpful.)


Perhaps if we had a term for "a pattern that is not asking anything", 
this may help to frame the story better.



[1] Currently, we've been calling patterns that are just a literal (if 
we ever have them) "constant patterns", but we could also call those 
"literal patterns", if we wanted to reserve "constant" as a modifier to 
mean that the pattern is a constant.  Or not.

Re: [External] : Re: Yielding blocks

2022-03-05 Thread Brian Goetz
I certainly agree that blocks would need something to prefix them.  I 
don't necessarily object to the concept, but we have other, 
higher-leverage activities I'd rather pursue instead right now.


On 3/4/2022 9:38 PM, Tagir Valeev wrote:

For the record, I suggested a similar enhancement two years ago:
https://mail.openjdk.java.net/pipermail/amber-spec-experts/2020-March/002046.html
The difference is that I think that explicit prefix, like `do { ... }`
would be better. In particular, it helps to disambiguate between array
initializers and block expressions.

With best regards,
Tagir Valeev

On Sat, Mar 5, 2022 at 12:36 AM Brian Goetz  wrote:

The following was received on the -comments list.

Summary: "Can we please have block expressions, you're almost there with the 
yielding block in switch expressions."

My observations: There was some discussion around the time we did switch expressions 
about whether we wanted a general-purpose block expression; it was certainly clear that 
we were inventing a limited form of block expression, and the worst thing about what we 
did back then was open the door to block expressions a bit, raising the inevitable 
questions about "why not just throw open the door".  While I can imagine 
wanting to open up on this at some point in the future, and am sympathetic to the notion 
that we want to express some things as expressions that currently require statements, I'm 
still not particularly motivated to throw open this door at this time.


 Forwarded Message 
Subject: Yielding blocks
Date: Fri, 4 Mar 2022 01:22:19 +0200
From: Dimitris Paltatzidis
To:amber-spec-comme...@openjdk.java.net


Methods, ignoring their arguments, form 2 categories:
1. Those that do not return a value (void).
2. Those that do.

Now, stripping away their signature, we get anonymous methods. Java supports
category 1, "blocks". They can be local, instance initializers or even
static.
On the other end, category 2 is missing. Consider the examples below of
trying
to initialize a local variable, with a complex computation.

A. (Verbose and garbage producing)
Type a = ((Supplier<Type>) () -> {
.
.
return .. ;
}).get();

B. (The hack, hard to reason)
Type a = switch (0) {
default -> {
.
.
yield .. ;
}
};

C. (Does not exist)
Type a = {
.
.
yield .. ;
}

All of them are equal in power, yet C is compact and readable. Of course, no
one will resort to A or B, instead they will compute in the same block as
the
variable in question (a). Unfortunately, that has a few drawbacks:
1. Pollution of the block with temporary variables that won't be needed
further
down the line, and at the same time, prohibiting the usage of the same
name,
especially if they were final.
2. Hard to reason where statements and computations start and end, without
resorting to comments.

Now, with limited power we can have:

D. (Legal)
Type a; {
.
.
a = .. ;
}

Actually D has fewer characters than C, for small enough variable names. It
also
solves the pollution problems stated above. But, as already mentioned, it
lacks
in power. We can't use it in places where only 1 statement is permitted,
like
computing the parameter of a method inside its parenthesis.

With C, we can even initialize instance variables (or static) that require
multiple statements, without resorting to an instance initializer block.
Basically, computing a value ad-hoc that fits everywhere.

The block right of the arrow -> of a switch case is basically it. It just
needs
a promotion to stand on its own.


Fwd: Yielding blocks

2022-03-04 Thread Brian Goetz

The following was received on the -comments list.

Summary: "Can we please have block expressions, you're almost there with 
the yielding block in switch expressions."


My observations: There was some discussion around the time we did switch 
expressions about whether we wanted a general-purpose block expression; 
it was certainly clear that we were inventing a limited form of block 
expression, and the worst thing about what we did back then was open the 
door to block expressions a bit, raising the inevitable questions about 
"why not just throw open the door".  While I can imagine wanting to open 
up on this at some point in the future, and am sympathetic to the notion 
that we want to express some things as expressions that currently 
require statements, I'm still not particularly motivated to throw open 
this door at this time.



 Forwarded Message 
Subject:Yielding blocks
Date:   Fri, 4 Mar 2022 01:22:19 +0200
From:   Dimitris Paltatzidis 
To: amber-spec-comme...@openjdk.java.net



Methods, ignoring their arguments, form 2 categories:
1. Those that do not return a value (void).
2. Those that do.

Now, stripping away their signature, we get anonymous methods. Java supports
category 1, "blocks". They can be local, instance initializers or even
static.
On the other end, category 2 is missing. Consider the examples below of
trying
to initialize a local variable, with a complex computation.

A. (Verbose and garbage producing)
Type a = ((Supplier<Type>) () -> {
.
.
return .. ;
}).get();

B. (The hack, hard to reason)
Type a = switch (0) {
default -> {
.
.
yield .. ;
}
};

C. (Does not exist)
Type a = {
.
.
yield .. ;
}

All of them are equal in power, yet C is compact and readable. Of course, no
one will resort to A or B, instead they will compute in the same block as
the
variable in question (a). Unfortunately, that has a few drawbacks:
1. Pollution of the block with temporary variables that won't be needed
further
down the line, and at the same time, prohibiting the usage of the same
name,
especially if they were final.
2. Hard to reason where statements and computations start and end, without
resorting to comments.

Now, with limited power we can have:

D. (Legal)
Type a; {
.
.
a = .. ;
}

Actually D has fewer characters than C, for small enough variable names. It
also
solves the pollution problems stated above. But, as already mentioned, it
lacks
in power. We can't use it in places where only 1 statement is permitted,
like
computing the parameter of a method inside its parenthesis.

With C, we can even initialize instance variables (or static) that require
multiple statements, without resorting to an instance initializer block.
Basically, computing a value ad-hoc that fits everywhere.

The block right of the arrow -> of a switch case is basically it. It just
needs
a promotion to stand on its own.

Re: [External] : Re: Telling the totality story

2022-03-04 Thread Brian Goetz



On 3/4/2022 5:37 AM, fo...@univ-mlv.fr wrote:





*From: *"Brian Goetz" 
*To: *"Remi Forax" 
*Cc: *"amber-spec-experts" 
*Sent: *Friday, March 4, 2022 3:23:58 AM
*Subject: *Re: [External] : Re: Telling the totality story



 Let

This is where it gets ugly.  Let has no opinions about
null, but the game here is to pretend it does.  So in a
let statement:

    let P = e

 - Evaluate e
 - If e is null
   - If P is a covering pattern, its binding is bound to null;
   - else we throws NPE
 - Otherwise, e is matched to P.  If it does not match
(its remainder), a MatchException is thrown (or the else
clause is taken, if one is present.)


It's not clear to me why a MatchException should be thrown
instead of not compiling if not exhaustive.


You're confusing "exhaustive" and "total".  A let must be
exhaustive, but exhaustiveness != totality.

It means that there are remainders that do not lead to
either an NPE or an error; do you have an example of such
a remainder?


Yes.  Suppose we have records Box<T>(T t) and Pair<T, U>(T t, U
u), and A is sealed to B|C.  Then if we're matching on a
Pair<Box<A>, A>, then

    Pair(null, B)
    Pair(Box(B), D)  // D is a type from the future
    Pair(Box(D), B)
    Pair(Box(D), D)
    Pair(null, D)

are all in the remainder.  It is a big stretch to claim either NPE
or ICCE is right for any of these, and completely arbitrary to
pick one for Pair(null, D).  Also, its much more expensive to try
to distinguish between these, and pick the "right" error for each,
rather than insert a default clause that throws
MatchRemainderException. 



The exceptions are here because the view at runtime and the view at 
compile time are slightly different:
- an NPE is raised when a value is null but the pattern asks for a 
deconstruction
- an ICCE if the compiler has used a sealed type during its analysis 
and the sealed type has changed


This was a thread about pedagogy.  You asked a factual question (though 
tangential) for clarification, which is fine, and got an answer.  But 
then you used the answer to spin off into a "let me redesign a feature 
that is unrelated to the thread."  That's not so good.


There actually was a thread about exceptions, where we could have 
discussed this (and we did, I think; these points here mostly seem 
repeated.)  By diverting this thread, it means we may not get back to 
the main point -- something I put time into because I wanted feedback.  
So far, we're three replies-to-replies into the weeds, and no one has 
talked about pedagogy.  And likely, no one will, because once a thread 
has been diverted, it rarely comes back.




Re: [External] : Re: Telling the totality story

2022-03-03 Thread Brian Goetz




 Let

This is where it gets ugly.  Let has no opinions about null, but
the game here is to pretend it does.  So in a let statement:

    let P = e

 - Evaluate e
 - If e is null
   - If P is a covering pattern, its binding is bound to null;
   - else we throws NPE
 - Otherwise, e is matched to P.  If it does not match (its
remainder), a MatchException is thrown (or the else clause is
taken, if one is present.)


It's not clear to me why a MatchException should be thrown instead of 
not compiling if not exhaustive.


You're confusing "exhaustive" and "total".  A let must be exhaustive, 
but exhaustiveness != totality.


It means that there are remainders that do not lead to either an NPE 
or an error; do you have an example of such a remainder?


Yes.  Suppose we have records Box<T>(T t) and Pair<T, U>(T t, U u), and 
A is sealed to B|C.  Then if we're matching on a Pair<Box<A>, A>, then


    Pair(null, B)
    Pair(Box(B), D)  // D is a type from the future
    Pair(Box(D), B)
    Pair(Box(D), D)
    Pair(null, D)

are all in the remainder.  It is a big stretch to claim either NPE or 
ICCE is right for any of these, and completely arbitrary to pick one for 
Pair(null, D).  Also, it's much more expensive to try to distinguish 
between these, and pick the "right" error for each, rather than insert a 
default clause that throws MatchRemainderException.


Re: [External] : Re: Proposal: java.lang.runtime.Carrier

2022-03-03 Thread Brian Goetz




Either way, we don't need to mutate or replace carriers. 



You want the same carrier for the whole pattern matching:


I think you're going about this backwards.  You seem to have a clear 
picture of how pattern matching "should" be translated.  If so, you 
should share!  Maybe your way is better.  But you keep making statements 
like "we need" and "we want" without explaining why.


- if you have a logical OR between patterns (not something in the 
current Java spec but Python, C# or clojure core.match have it so we 
may want to add an OR in the future)


OR combinators are a good point, but they can be done without a with 
operation.


- if different cases start with the same prefix of patterns, so you 
don't have to re-execute the de-constructors/pattern methods of the 
prefix several times


Agree that optimizing away multiple invocations is good, but again, I 
don't see that as being coupled to the pseudo-mutability of the carrier.



Perhaps you should start with how you see translation working?

Re: [External] : Re: Primitive type patterns

2022-03-03 Thread Brian Goetz
I read JLS 5.2 more carefully and discovered that while assignment 
context supports primitive narrowing from int-and-smaller to 
smaller-than-that:


    byte b = 0

it does not support primitive narrowing from long to int:

    int x = 0L  // error

My best guess at rationale is that because there is no suffix for 
int/short/byte, then int literals are like "poly expressions" but long 
literals are just long literals.  That's an irritating asymmetry (but, 
fixable.)


In addition, if the expression is a constant expression (§15.29) of type byte, 
short, char, or int:
• A narrowing primitive conversion may be used if the variable is of type byte, 
short, or char, and the value of the constant expression is representable in 
the type of the variable.
• A narrowing primitive conversion followed by a boxing conversion may be used 
if the variable is of type Byte, Short, or Character, and the value of the 
constant expression is representable in the type byte, short, or char 
respectively.




On 3/3/2022 10:17 AM, Dan Heidinga wrote:

On Wed, Mar 2, 2022 at 3:13 PM Brian Goetz  wrote:



On 3/2/2022 1:43 PM, Dan Heidinga wrote:

Making the pattern match compatible with assignment conversions makes
sense to me and follows a similar rationale to that used with
MethodHandle::asType following the JLS 5.3 invocation conversions.
Though with MHs we had the ability to add additional conversions under
MethodHandles::explicitCastArguments. With pattern matching, we don't
have the same ability to make the "extra" behaviour opt-in / opt-out.
We just get one chance to pick the right behaviour.

Indeed.  And the thing that I am trying to avoid here is creating _yet
another_ new context in which a different bag of ad-hoc conversions are
possible.  While it might be justifiable from a local perspective to say
"its OK if `int x` does unboxing, but having it do range checking seems
new and different, so let's not do that", from a global perspective,
that means we have a new context ("pattern match context") to add to
assignment, loose invocation, strict invocation, cast, and numeric
contexts.  That is the kind of incremental complexity I'd like to avoid,
if there is a unifying move we can pull.

I'm in agreement on not adding new contexts but I had the opposite
impression here.  Doesn't "having it do range checking" require a new
context as this is different from what assignment contexts allow
today?  Or is it the case that regular, non-match assignment must be
total with no left over that allows them to use the same context
despite not being able to do the dynamic range check?  As this
sentence shows, I'm confused on how dynamic range checking fits in the
existing assignment context.

Or are we suggesting that assignment allows:

byte b = new Long(5);

to succeed if we can unbox + meet the dynamic range check?  I'm
clearly confused here.


Conversions like unboxing or casting are burdened by the fact that they
have to be total, which means the "does it fit" / "if so, do it" / "if
not, do something else (truncate, throw, etc)" all have to be crammed
into a single operation.  What pattern matching is extracts the "does it
fit, and if so do it" into a more primitive operation, from which other
operations can be composed.

Is it accurate to say this is less reusing assignment context and more
completely replacing it with a new pattern context from which
assignment can be built on top of?


At some level, what I'm proposing is all spec-shuffling; we'll either
say "a widening primitive conversion is allowed in assignment context",
or we'll say that primitive `P p` matches any primitive type Q that can
be widened to P.  We'll end up with a similar number of rules, but we
might be able to "shake the box" to make them settle to a lower energy
state, and be able to define (whether we explicitly do so or not)
assignment context to support "all the cases where the LHS, viewed as a
type pattern, are exhaustive on the RHS, potentially with remainder, and
throws if remainder is encountered."  (That's what unboxing does; throws
when remainder is encountered.)

Ok. So maybe I'm not confused.  We'd allow the `byte b = new Long(5);`
code to compile and throw not only on a failed unbox, but also on a
dynamic range check failure.

If we took this "dynamic hook" behaviour to the limit, what other new
capabilities does it unlock?  Is this the place to connect other
user-supplied conversion operations as well?  Maybe I'm running too
far with this idea but it seems like this could be laying the
groundwork for other interesting behaviours.  Am I way off in the
weeds here?


As to the range check, it has always bugged me that you see code that
looks like:

  if (i >= -127 && i <= 128) { byte b = (byte) i; ... }

because of the accidental specificity, and the attendant risk of error
(using <= instead of

Telling the totality story

2022-03-03 Thread Brian Goetz
Given the misconceptions about totality and whether "pattern matching 
means the same thing in all contexts", it is pretty clear that the story 
we're telling is not evoking the right concepts.  It is important not 
only to have the right answer, but also to have the story that helps 
people understand why its right, so let's take another whack at that.


(The irony here is that it only makes a difference at one point -- null 
-- which is the least interesting part of the story.  So it is much 
sound and fury that signifies nothing.)


None of what we say below represents a change in the language; it is 
just test-driving a new way to tell the same story.  (From a 
mathematical and specification perspective, it is a much worse story, 
since it is full of accidental detail about nulls (among other sins), 
but it might make some people happier, and maybe we can learn something 
from the reaction.)


The key change here is to say that every pattern-matching construct 
needs special rules for null.  This is already true, so we're just 
doubling down on it.  None of this changes anything about how it works.


 Instanceof

We define:

    x instanceof P === (x != null) && x matches P

This means that instanceof always says false on null; on non-null, we 
can ask the pattern.  This is consistent with instanceof today.  P will 
just never see a null in this case. We'll define "matches" soon.


 Switch

We play the same game with switch.  A switch:

    switch (x) {
    case P1:
    case P2:
    ...
    }

is defined to mean:

 - if x is null
   - if one of P1..Pn is the special label "null", then we take that branch
   - else we throw NPE
 - otherwise, we start trying to match against P1..Pn sequentially
 - if none match (its remainder), throw MatchException

This is consistent with switch today, in that there are no null labels, 
so null always falls into the NPE bucket.  It is also easy to keep track 
of; if the switch does not say "case null", it does not match null.  
None of the patterns P1..Pn will ever see a null.
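
For example (a sketch; `case null` and pattern labels are the preview syntax):

    Object x = null;
    switch (x) {
        case null      -> System.out.println("only this label ever sees a null target");
        case String s  -> System.out.println("s is never null here");
        default        -> System.out.println("some other non-null value");
    }
    // Without the `case null` label, the null target would throw NullPointerException
    // before any of P1..Pn is consulted.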


 Covering patterns

Let's define what it means for a pattern to cover a type T. This is just 
a new name for totality; we say that an "any" pattern covers all types 
T, and a type pattern `U u` covers T when `T <: U`.  (When we do 
primitive type patterns, there will be more terms in this list, but 
they'll still be type patterns.)  No other pattern is a covering 
pattern.  Covering patterns all have a single binding (`var x`, `String 
s`).


 Let

This is where it gets ugly.  Let has no opinions about null, but the 
game here is to pretend it does.  So in a let statement:


    let P = e

 - Evaluate e
 - If e is null
   - If P is a covering pattern, its binding is bound to null;
   - else we throws NPE
 - Otherwise, e is matched to P.  If it does not match (its remainder), 
a MatchException is thrown (or the else clause is taken, if one is present.)
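
Concretely, for a non-covering pattern, a hypothetical `let Rec(var a) = e` 
would behave like the following sketch (with an ordinary exception standing in 
for whatever remainder exception is chosen):

    record Rec(int a) { }

    static int letDemo(Object e) {
        // let Rec(var a) = e;
        if (e == null)
            throw new NullPointerException();          // Rec(var a) is not a covering pattern
        if (!(e instanceof Rec(var a)))
            throw new RuntimeException("remainder");   // or: take the else clause, if present
        return a;                                      // the binding is in scope from here on
    }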


 Nesting

Given a nested pattern R(P), where R is `record R(T t) { }`, this means:

 - Match the target to record R, and note the resulting t
   - if P is a covering pattern, we put the resulting `t` in the 
binding of P without further matching

   - Otherwise, we match t to P

In other words: when a "non trivial" (i.e., non covering) pattern P is 
nested in a record pattern, then:


    x instanceof R(P) === x instanceof R(var a) && a instanceof P

and a covering pattern is interpreted as "not really a pattern", and we 
just slam the result of the outer binding into the inner binding.


 Matching

What we've done now is ensure that no pattern ever actually encounters a 
null, because the enclosing constructs always filter the nulls out.  
This means we can say:


 - var patterns match everything
 - `T t` is gated by `instanceof T`
 - R(P) is gated by `instanceof R`


## Is this a better way to tell the story?

This is clearly a much more complex way to tell the story; every 
construct must have a special case for null, and we must be prepared to 
treat simple patterns (`var x`, `String s`) not always as patterns, but 
sometimes as mere declarations whose bindings will be forcibly crammed 
from "outside".


Note too this will not scale to declared total patterns that treat null 
as a valid target value, if we ever want them; it depends on being able 
to "peek inside" the pattern from the outside, and say "oh, you're a 
covering pattern, I can set your binding without asking you."  That 
won't work with declared total patterns, which are opaque and may have 
multiple bindings.  (Saying "this doesn't bother me" is saying "I'm fine 
with a new null gate for total declared patterns.")


It also raises the profile of null in the story, which kind of sucks; 
while it is not new "null gates", it is a new story that is prominently 
parameterized by null.  (Which is a funny outcome for a system 
explicitly designed to *not* create new null gates!)



 Further editorializing

So, having written it out, I dislike it 

Re: [External] : Re: Proposal: java.lang.runtime.Carrier

2022-03-03 Thread Brian Goetz




For the pattern matching,
we also need a 'with' method, that return a method handle that
takes a carrier and a value and return a new carrier with the
component value updated.


It is not clear to me why we "need" this.  Rather than jumping
right to "Here is the solution", can you instead try to shine some
light on the problem you are trying to solve? 



When you have nested record patterns, each of these patterns 
contribute to introduce bindings, so when executing the code of the 
pattern matching, the code that match a nested pattern needs to add 
values into the carrier object. Given that the Carrier API is 
immutable, we need the equivalent of a functional setter, a wither.




I don't think we need to do this.

Recall what nested patterns means: if R(T t) is a record, and Q is a 
pattern that is not total, then


    x matches R(Q)

means

    x matches R(T t) && t matches Q

So if we have

    record R(S s) { }
    record S(int a, int b) { }

then

    case R(S(var a, var b))

operates by matching the target to R, deconstructing with the R 
deconstructor, which yields a carrier of shape (S).  Then we further 
match the first component of this carrier to the S deconstructor, which 
yields a carrier of shape (II).  No mutation needed.
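
Spelled out with (preview, at this point) instanceof syntax against the R and 
S records above, that unrolling is just:

    Object x = new R(new S(1, 2));
    if (x instanceof R(S s1) && s1 instanceof S(var a, var b)) {
        // the first match produced the (S)-shaped carrier, the second the (II)-shaped
        // one; neither carrier was ever mutated
        System.out.println(a + ", " + b);
    }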


Note that this unrolling can happen in one of two ways:

 - The compiler just unrolls it doing plain vanilla compiler stuff
 - A pattern runtime has a nesting combinator that takes a pattern 
description for an outer and an inner pattern, which when evaluated with 
R and S, yields a carrier of shape (R;S;II), the compiler evaluates this 
nesting combinator with condy, and uses that to do the match.


Either way, we don't need to mutate or replace carriers.



Re: [External] : Re: Proposal: java.lang.runtime.Carrier

2022-03-03 Thread Brian Goetz



For the pattern matching,
we also need a 'with' method, that return a method handle that takes a 
carrier and a value and return a new carrier with the component value 
updated.


It is not clear to me why we "need" this.  Rather than jumping right to 
"Here is the solution", can you instead try to shine some light on the 
problem you are trying to solve?


In terms of the spec, Jim, can you rename "component getter" to "component 
accessor", which is the term used by records.


"Accessor" is a perfectly OK term, but remember, these are not records, 
and "record component" means something.  I think its OK to use accessor 
here, not because its what records do, but because we don't need to give 
people the idea that this has something to do with beans.

Re: [External] : Re: Primitive type patterns

2022-03-03 Thread Brian Goetz




I'm in agreement on not adding new contexts but I had the opposite
impression here.  Doesn't "having it do range checking" require a new
context as this is different from what assignment contexts allow
today?  Or is it the case that regular, non-match assignment must be
total with no left over that allows them to use the same context
despite not being able to do the dynamic range check?  As this
sentence shows, I'm confused on how dynamic range checking fits in the
existing assignment context.

Or are we suggesting that assignment allows:

byte b = new Long(5);

to succeed if we can unbox + meet the dynamic range check?  I'm
clearly confused here.


At a meta level, the alignment target is:
   - given a target type `T`
   - given an expression `e : E`

then:
  - being able to statically determine whether `T t` matches `e` should 
be equivalent to whether the assignment `T t = e` is valid under the 
existing 5.2 rules.


That is to say, the existing 5.2 rules may look like a bag of ad-hoc, 
two-for-one-on-tuesday rules, but really, they will be revealed to be 
the set of conversions that are consistent with statically determining 
whether `T t` matches `e : E`.  Most of these rules involve only T and E 
(e.g., widening primitive conversion), but one of them is about ranges, 
which we can only statically assess when `e` is a constant.





Conversions like unboxing or casting are burdened by the fact that they
have to be total, which means the "does it fit" / "if so, do it" / "if
not, do something else (truncate, throw, etc)" all have to be crammed
into a single operation.  What pattern matching is extracts the "does it
fit, and if so do it" into a more primitive operation, from which other
operations can be composed.

Is it accurate to say this is less reusing assignment context and more
completely replacing it with a new pattern context from which
assignment can be built on top of?


Yes!  Ideally, this is one of those "jack up the house and provide a 
solid foundation" moves.





At some level, what I'm proposing is all spec-shuffling; we'll either
say "a widening primitive conversion is allowed in assignment context",
or we'll say that primitive `P p` matches any primitive type Q that can
be widened to P.  We'll end up with a similar number of rules, but we
might be able to "shake the box" to make them settle to a lower energy
state, and be able to define (whether we explicitly do so or not)
assignment context to support "all the cases where the LHS, viewed as a
type pattern, are exhaustive on the RHS, potentially with remainder, and
throws if remainder is encountered."  (That's what unboxing does; throws
when remainder is encountered.)

Ok. So maybe I'm not confused.  We'd allow the `byte b = new Long(5);`
code to compile and throw not only on a failed unbox, but also on a
dynamic range check failure.


No ;)

Today, we would disallow this assignment because it is not an unboxing 
followed by a primitive widening.  (The opposite, long l = new Byte(3), 
would be allowed today, except that we took away these constructors so 
you have to use valueOf.)  We would only allow a narrowing if the RHS 
were a constant, like "5", in which case the compiler would statically 
evaluate the range check and narrow 5 to byte.
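
In other words, what JLS 5.2 lets compile today:

    byte ok = 5;       // constant expression in range: the compiler narrows it
    // byte no = 300;  // constant out of range: compile-time error
    int i = 5;
    // byte nope = i;  // not a constant expression: no int -> byte narrowing in assignment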


Tomorrow, the assignment would be the same; assignment works based on 
"statically determined to match", and we can only statically determine 
the range check if we know the target value, i.e., its a constant.  But, 
if you *asked*, then you can get a dynamic range check:


    if (anInt matches byte b) // we get a range check here

The reason we don't do that with assignment is we don't know what to do 
if it doesn't match.  But if its in a conditional context (if or 
switch), then the programmer is going to tell us what to do if it 
doesn't match.



If we took this "dynamic hook" behaviour to the limit, what other new
capabilities does it unlock?  Is this the place to connect other
user-supplied conversion operations as well?  Maybe I'm running too
far with this idea but it seems like this could be laying the
groundwork for other interesting behaviours.  Am I way off in the
weeds here?


Not entirely in the weeds.  The problem with assignment, casting, and 
all of those things is that they have to be total; when you say "x = y" 
then the guarantee is that *something* got assigned to x. Now, we are 
already cheating a bit, because `x = y` allows unboxing, and unboxing 
can throw.  (Sounds like remainder rejection!)   Now, imagine we had an 
"assign or else" construct (with static types A and B):


    a := (b, e)

then this would mean

    if (b matches A aa)
    a = aa
    else
    a = e  // and maybe e is really a function of b

In the case of unboxing conversions, our existing assignment works kind 
of like:


    a := (b, throw new NPE)

because we'd try to match, and if it fails, evaluate the second 
component, which throws.


Obviously I'm not suggesting we tinker with 

Re: Proposal: java.lang.runtime.Carrier

2022-03-03 Thread Brian Goetz

Thanks Jim.

As background, (some form of) this code originated in a prototype for 
pattern matching, where we needed a carrier for a tuple (T, U, V) to 
carry the results of a match from a deconstruction pattern (or other 
declared pattern) on the stack as a return value.  We didn't want to 
spin a custom class per pattern, and we didn't want to commit to the 
actual layout, because we wanted to preserve the ability to switch later 
to a value class.  So the idea is you describe the carrier you want as a 
MethodType, and there's a condy that gives you an MH that maps that 
shape of arguments to an opaque carrier (the constructor), and other 
condys that give you MHs that map from the carrier to the individual 
bindings.  So pattern matching will stick those MHs in CP slots.


The carrier might be some bespoke thing (e.g., record anon(T t, U u, V 
v)), or something that holds an Object[], or something with three int 
fields and two ref fields, or whatever the runtime decides to serve up.


The template mechanism wants almost exactly the same thing for bundling 
the parameters for uninterprted template strings.


Think of it as a macro-box; instead of boxing primitives to Object and 
Objects to varargs, there's a single boxing operation from a tuple to an 
opaque type.
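
A hedged before/after of that "macro-box" framing (Carrier being the API 
proposed in this thread, not something that exists yet):

    // today: box each value and spread the tuple into an Object[]
    Object[] bag = new Object[] { 42, "hi" };

    // proposed: one boxing operation from the whole tuple to an opaque carrier
    MethodType shape = MethodType.methodType(Object.class, int.class, String.class);
    Object carrier = Carrier.constructor(shape).invoke(42, "hi");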




On 3/3/2022 8:57 AM, Jim Laskey wrote:


We propose to provide a runtime anonymous carrier class object 
generator; java.lang.runtime.Carrier. This generator class is 
designed to share anonymous classes when shapes are similar. For 
example, if several clients require objects containing two integer 
fields, then Carrier will ensure that each client generates carrier 
objects using the same underlying anonymous class.


Providing this mechanism decouples the strategy for carrier class 
generation from the client facility. One could implement one class per 
shape; one class for all shapes (with an Object[]), or something in 
the middle; having this decision behind a bootstrap means that it can 
be evolved at runtime, and optimized differently for different situations.



  Motivation

The String Templates JEP draft proposes the 
introduction of a TemplatedString object for the primary purpose of 
carrying the template and associated values derived from a 
template literal. To avoid value boxing, early prototypes described 
these carrier objects using per-callsite anonymous classes shaped 
by value types. The use of distinct anonymous classes here is 
overkill, especially considering that many of these classes are 
similar; containing one or two object fields and/or one or two 
integral fields. Pattern matching has a similar issue when carrying 
the values for the holes of a pattern. With potentially hundreds 
(thousands?) of template literals or patterns per application, we need 
to find an alternate approach for these value carriers.



  Description

In general terms, the Carrier class simply caches anonymous classes 
keyed on shape. To further increase similarity in shape, the ordering 
of value types is handled by the API and not in the underlying 
anonymous class. If one client requires an object with one object 
value and one integer value and a second client requires an object 
with one integer value and one object value, then both clients will 
use the same underlying anonymous class. Further, types are folded as 
either integer (byte, short, int, boolean, char, float), long (long, 
double) or object. [We've seen that the performance hit from folding 
the long group into the integer group is significant, hence the 
separate group.]


The Carrier API uses MethodType parameter types to describe the 
shape of a carrier. This fits the primary use case, where 
bootstrap methods need to capture indy non-static arguments. The API 
has three static methods:


// Return a constructor MethodHandle for a carrier with components
// aligning with the parameter types of the supplied methodType.
static MethodHandle constructor(MethodType methodType)

// Return a component getter MethodHandle for component i.
static MethodHandle component(MethodType methodType, int i)

// Return component getter MethodHandles for all the carrier's components.
static MethodHandle[] components(MethodType methodType)



  Examples

import java.lang.runtime.Carrier;
...

// Define the carrier description.
MethodType methodType =
    MethodType.methodType(Object.class, byte.class, short.class,
                          char.class, int.class, long.class,
                          float.class, double.class,
                          boolean.class, String.class);

// Fetch the carrier constructor.
MethodHandle constructor = Carrier.constructor(methodType);

// Create a carrier object.
Object object = (Object)constructor.invokeExact((byte)0xFF, (short)0x, 'C',
    0x, 0xL, 1.0f / 3.0f, 1.0 / 3.0, true, "abcde");

// Get an array of accessors for the carrier object.
MethodHandle[] components = 

Re: [External] : Re: Primitive type patterns

2022-03-02 Thread Brian Goetz




On 3/2/2022 1:43 PM, Dan Heidinga wrote:


Making the pattern match compatible with assignment conversions makes
sense to me and follows a similar rationale to that used with
MethodHandle::asType following the JLS 5.3 invocation conversions.
Though with MHs we had the ability to add additional conversions under
MethodHandles::explicitCastArguments. With pattern matching, we don't
have the same ability to make the "extra" behaviour opt-in / opt-out.
We just get one chance to pick the right behaviour.


Indeed.  And the thing that I am trying to avoid here is creating _yet 
another_ new context in which a different bag of ad-hoc conversions are 
possible.  While it might be justifiable from a local perspective to say 
"its OK if `int x` does unboxing, but having it do range checking seems 
new and different, so let's not do that", from a global perspective, 
that means we have a new context ("pattern match context") to add to 
assignment, loose invocation, strict invocation, cast, and numeric 
contexts.  That is the kind of incremental complexity I'd like to avoid, 
if there is a unifying move we can pull.


Conversions like unboxing or casting are burdened by the fact that they 
have to be total, which means the "does it fit" / "if so, do it" / "if 
not, do something else (truncate, throw, etc)" all have to be crammed 
into a single operation.  What pattern matching is extracts the "does it 
fit, and if so do it" into a more primitive operation, from which other 
operations can be composed.


At some level, what I'm proposing is all spec-shuffling; we'll either 
say "a widening primitive conversion is allowed in assignment context", 
or we'll say that primitive `P p` matches any primitive type Q that can 
be widened to P.  We'll end up with a similar number of rules, but we 
might be able to "shake the box" to make them settle to a lower energy 
state, and be able to define (whether we explicitly do so or not) 
assignment context to support "all the cases where the LHS, viewed as a 
type pattern, are exhaustive on the RHS, potentially with remainder, and 
throws if remainder is encountered."  (That's what unboxing does; throws 
when remainder is encountered.)


As to the range check, it has always bugged me that you see code that 
looks like:


    if (i >= -127 && i <= 128) { byte b = (byte) i; ... }

because of the accidental specificity, and the attendant risk of error 
(using <= instead of <, or using 127 instead of 128). Being able to say:


    if (i instanceof byte b) { ... }

is better not because it is more compact, but because you're actually 
asking the right question -- "does this int value fit in a byte."  I'm 
sad we don't really have a way to ask this question today; it seems an 
omission.
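
Side by side, with the bounds spelled via the constants (and with the pattern 
form shown as the proposal it is, not current Java):

    int i = 200;

    // today: hand-written range check, easy to get subtly wrong
    if (i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE) {
        byte b = (byte) i;
        // ...
    }

    // proposed: ask the question directly
    // if (i instanceof byte b) { ... }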



Intuitively, the behaviour you propose is kind of what we want - all
the possible byte cases end up in the byte case and we don't need to
adapt the long case to handle those that would have fit in a byte.
I'm slightly concerned that this changes Java's historical approach
and may lead to surprises when refactoring existing code that treats
unbox(Long) one way and unbox(Short) another.  Will users be confused
when the unbox(Long) in the short right range ends up in a case that
was only intended for unbox(Short)?  I'm having a hard time finding an
example that would trip on this but my lack of imagination isn't
definitive =)


I'm worried about this too.  We examined it briefly, and ran away, when 
we were thinking about constant patterns, specifically:


    Object o = ...
    switch (o) {
    case 0: ...
    default: ...
    }

What would this mean?  What I wouldn't want it to mean is "match Long 0, 
Integer 0, Short 0, Byte 0, Character 0"; that feels like it is over the 
line for "magic".  (Note that this is about defining what the _constant 
pattern_ means, not the primitive type pattern.) I think it's probably 
reasonable to say this is a type error; 0 is applicable to primitive 
numerics and their boxes, but not to Number or Object.  I think that is 
consistent with what I'm suggesting about primitive type patterns, but 
I'd have to think about it more.



Something like following shouldn't be surprising given the existing
rules around unbox + widening primitive conversion (though it may be
when first encountered as I expect most users haven't really
internalized the JLS 5.2 rules):


As Alex said to me yesterday: "JLS Ch 5 contains many more words than 
any prospective reader would expect to find on the subject, but once the 
reader gets over the overwhelm of how much there is to say, will find 
none of the words surprising."  There's a deeper truth to this 
statement: Java is not actually as simple a language as its mythology 
suggests, but we win by hiding the complexity in places users generally 
don't have to look, and if and when they do confront the complexity, 
they find it unsurprising, and go back to ignoring it.


So in point of fact, *almost no one* has read JLS 5.2, but it still 

Re: [External] : Re: Primitive type patterns

2022-03-02 Thread Brian Goetz




On 3/2/2022 2:36 PM, fo...@univ-mlv.fr wrote:


There are two ways to express "match non null Integer + unboxing",
this one
   Integer value = ...
   switch(value) {
 case Integer(int i) -> ...
   }

And we already agree that we want that syntax.


Wait, what?  The above is not yet on the table; we will have to wait for 
deconstruction patterns on classes to be able to express that. When we 
get there, we'll have a choice of whether we want to add a deconstructor 
to the wrapper classes.  (At which point, you might well say "we already 
have a way to do that"...)



You are proposing a new one
   Integer value = ...
   switch(value) {
 case int i -> ...
   }


But if it was on the table now, it would still not be particularly 
notable as a "new way" of anything; this would be true for *every single 
nested pattern*.  In fact, that's a feature, not a bug, that you can 
unroll a switch with a non-total nested pattern to a nested switch, and 
that is a desirable refactoring to support.



Obviously, your proposal makes things less simple because we now have two ways 
to say the same thing.


Not obvious at all.  Java's simplicity does not derive from "exactly one 
way to do each thing"; suggesting otherwise is muddying the terms for 
rhetorical effect, which is not helpful.



I think we do not need assignment conversions *and* that introducing them makes 
the semantics harder to understand.


The former is a valid opinion, and is noted!




Re: [External] : Re: Primitive type patterns

2022-02-28 Thread Brian Goetz
This is a valid generalized preference (and surely no one is going to 
say "no, I prefer to play to our weaknesses.")  But at root, I think 
what you are saying is that you would prefer that pattern matching 
simply be a much smaller and less fundamental feature than what is being 
discussed here.  And again, while I think that's a valid preference, I 
think the basis of your preference is that it is "simpler", but I do not 
think it actually delivers the simplicity dividend you are hoping for, 
because there will be subtle mismatches that impede composition and 
refactoring (e.g., new "null gates" and "box gates".)


In any case, it's clear that nearly everything about the way we've 
designed pattern matching is "not how you would have done it"; you don't 
like the totality, exhaustiveness, error handling, and conversion 
rules.  And that's fine, but I think we're spending too much effort on 
"I wish there were a different design center".  What I'd like to get 
feedback on is:


 - Is there anything *specifically wrong* with what is being proposed?  
Did I extrapolate incorrectly from the rules for assignment conversion, 
for example?
 - How can we come to a better way of presenting the material, so that 
people don't fall into the same pothole regarding "pattern matching 
means different things in different places", for example?


Would also like to hear from other people

I think we should play on our strengths, destructuring instead of 
unboxing, pattern methods instead of primitive conversions + rangecheck.
I prefer to have a few small well defined patterns (type pattern, type 
destructuring, pattern method) each with a simple semantics (so keep 
the type pattern to be about subtyping relationship only) and draw 
power from the ability to compose them instead of having patterns with 
a huge semantic baggage (type pattern with the assignment semantics) 
and the corner cases that come with it.


We may still need assignment conversions when we mix patterns and 
assignment, but only because we want to be backward compatible with 
the simple assignment semantics.






Re: [External] : Re: Primitive type patterns

2022-02-28 Thread Brian Goetz



Now, what if instead of Object, we start with Long?

    Long l = 0L
    if (l instanceof byte b) { ... }

First, applicability: does Long unbox to a primitive type that can
be narrowed to byte?  Yes!  Long unboxes to long, and long can be
narrowed to byte.

Then: matching: if the RHS is not null, we unbox, and do a range
check.  (The rules in my previous mail probably didn't get this
case perfectly right), but 0L will match, and 0x will not --
as we would expect. 



This is totally alien to me. When you have x instanceof Foo (note: 
this is not the pattern version) with X the type of x, then if x is 
declared with a supertype of X it works the exact same way, i.e. I 
don't have to care too much about the type of the thing I'm doing an 
instanceof on / switching over.


Yes, I understand your discomfort.  And I will admit, I don't love this 
particular corner-of-a-corner either.  (But let's be clear: it is a 
corner.  If you're seeking to throw out the whole scheme on the basis 
that corners exist, you'll find the judge to be unsympathetic.)


So why have I proposed it this way?  Because, unfortunately, of this 
existing line in JLS 5.2 (which I never liked):


> an unboxing conversion followed by a widening primitive conversion

This is what lets you say:

    long l = anInteger

And, I never liked this rule, but we're stuck with it.  The inverse, 
from which we would derive this rule, is that


    anInteger instanceof long l

should be applicable, and in fact always match when the LHS is 
non-null.  I would prefer to not allow this assignment conversion, and 
similarly not allow both unboxing and widening in one go in pattern 
matching, but I didn't get to write JLS 5.2.


What's new here is going in the *other* direction:

    anInteger instanceof short s

and I think what is making you uncomfortable is that you are processing 
two generalizations at once, and it's pushing your "OMG different! 
scary!" buttons:


 - that we're defining primitive type patterns in a way such that we 
can derive the existing assignment conversions;
 - that primitive type patterns can have dynamic checks that primitive 
assignments cannot, so we're including the value-range check.


Each individually is not quite as scary, but I can understand why the 
two together would seem scary.  (And, as I mentioned, I don't like the 
unbox-and-widen conversions either, but I didn't invent those.)




Re: [External] : Re: Primitive type patterns

2022-02-28 Thread Brian Goetz




So *of course* there's an obvious definition of how `int x`
matches against Integer, and its not a question of whether we
"define" it that way, its a question of whether we expose the
obvious meaning, or suppress it.  I think the arguments in favor
of suppression are pretty weak.


The strong argument is that instanceof/switch case is about subtyping 
relationship while assignment is about assignment conversions, trying 
to blur the lines between the two has already been tried in the past 
and it results in pain (see below).


This is pretty much assuming your conclusion, and then stating it as 
justification :)


I get it; you would prefer that pattern matching be *only* about 
subtyping.  I understand why you prefer that.  But I think this is 
mostly a "retreat to the comfort zone" thing.




What about ?

Object o =...
if (o instanceof byte) { ... }

Does it means that o can be a Long ?



This is a good question.  (But I know it's also a trap.)  We first have 
to ask about (static) applicability: is the pattern `byte b` applicable 
to Object?  If not, we'll get a compilation error.


My earlier message said:

 - A primitive type pattern `P p` should be applicable to a reference 
target T if T unboxes to P, or T unboxes to a primitive type that can be 
widened to P [ or if T unboxes to a primitive type that can be narrowed 
to P. ]


Does Object unbox to byte?  No.
Does Object unbox to a primitive type that can be widened to byte?  No.
[brackets] Does Object unbox to a primitive type that can be narrowed to 
byte?  No.


How does this square with assignments?  I cannot assign

    byte b = anObject

|  incompatible types: java.lang.Object cannot be converted to byte

If I try this with casting:

   Object o = 0L
   byte b = (byte) o

I get a CCE, because the cast will only convert from Byte.

Now, what if instead of Object, we start with Long?

    Long l = 0L
    if (l instanceof byte b) { ... }

First, applicability: does Long unbox to a primitive type that can be 
narrowed to byte?  Yes!  Long unboxes to long, and long can be narrowed 
to byte.


Then: matching: if the RHS is not null, we unbox, and do a range check.  
(The rules in my previous mail probably didn't get this case perfectly 
right), but 0L will match, and 0x will not -- as we would expect.



We could consider pushing this farther, if we liked, but there's a risk 
of pushing it too far, and I think we're already on the cusp of 
diminishing returns.  We could consider `case byte b` against Object to 
match a Byte.  We discussed this, in fact, a few years ago when we 
talked about "what does `case 0` mean" when we were considering constant 
patterns.  (And we concluded in that discussion that the step where we 
basically treat Long as narrowable to Integer was taking it way too 
far.)  So I think the rules I've specified capture (a) the right set of 
tradeoffs of how loose we want to be with regard to boxing/unboxing 
combined with narrowing/widening, subject to (b) the existing decisions 
we've made about assignments.






Re: Primitive type patterns

2022-02-28 Thread Brian Goetz

Let me put the central question here in the spotlight.


 Boxing and unboxing

Suppose our match target is a box type, such as:

    record IntegerBox(Integer i) { }

Clearly we can match it with:

    case IntegerBox(Integer i): 


We could stop here, and say the pattern `int i` is not applicable to 
`Integer`, which means that `IntegerBox(int i)` is not applicable to 
IntegerBox.  This is a legitimate choice (though, I think it would be a 
bad one, one that we would surely revisit sooner rather than later.)


Users might well not understand why they could not say

    case IntegerBox(int i)

because (a) you can unbox in other contexts and (b) this has a very 
clear meaning, and one analogous to `case ObjectBox(String s)` -- if the 
box contents fits in a String, then match, otherwise don't. They might 
get even more balky if you could say


    int x = anInteger

but not

    let int x = anInteger

And, if we went with both Remi's preferences -- the minimalist matching 
proposal and dropping "let" in the let/bind syntax (which we will NOT 
discuss further here) then the following would be ambiguous:


    int x = anInteger  // is this an assignment, in which case we 
unbox, or a match, in which case we don't?


and we would have to "arbitrarily" pick the legacy interpretation.

But all of this "balking" is symptom, not disease.

The reality is that pattern matching is more primitive than unboxing 
conversions, and if we lean into that, things get simpler.  An unboxing 
conversion may seem like one thing, but is actually two: try to match 
against a partial pattern, and if there is no match, fail.  In other 
words, unboxing is:


    int unbox(Integer n) {
        return switch (n) {
            case int i -> i;
            case null -> throw new NullPointerException();
        };
    }

The unboxing we have jams together the pattern match and the 
throw-on-fail, because it has no choice; unboxing wants to be total, and 
there's no place to specify what to try if the pattern match fails.  But 
pattern matching is inherently conditional, allowing us to build 
different constructs on it which handle failures differently.


So *of course* there's an obvious definition of how `int x` matches 
against Integer, and its not a question of whether we "define" it that 
way, its a question of whether we expose the obvious meaning, or 
suppress it.  I think the arguments in favor of suppression are pretty weak.



There's a similar argument when it comes to narrowing from (say) long to 
int.  There's a very natural interpretation of matching `int x` to a 
long: does the long value fit in an int?  In assignment context, we do 
the best we currently can -- we allow the narrowing only if we can 
statically prove it will match, which means the match target must be a 
constant.  But again, conversions in assignment context are not the 
primitive.  If we take the obvious definition of matching `int x` 
against long, then the current rules fall out naturally, and we can ask 
sensible questions like


    if (aLong instanceof byte) { ... }
    else if (aLong instanceof short) { ... }

by *asking the primitive directly*, rather than appealing to some proxy 
(like manually unrolling the range check.)
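For comparison, here is roughly the hand-unrolled range check that such a 
primitive pattern would stand in for; this is plain, current Java, and the 
variable name is just illustrative:

    long aLong = 1234L;
    if (aLong >= Byte.MIN_VALUE && aLong <= Byte.MAX_VALUE) {
        byte b = (byte) aLong;     // safe: the value fits in a byte
        // ...
    } else if (aLong >= Short.MIN_VALUE && aLong <= Short.MAX_VALUE) {
        short s = (short) aLong;   // safe: the value fits in a short
        // ...
    }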




Re: [External] : Re: Primitive type patterns

2022-02-28 Thread Brian Goetz




Nope,
i'm saying that inside a pattern if we let the unboxing to be possible 
with the semantics that if the value is null, instead of throwing a 
NPE if it does not match, we are introducing the equivalent of the 
null-safe operator of Groovy (the elvis operator),  something we 
should not do (and that was rejected in the past, i think as part of 
Coin or Coin 2).


Too silly to respond to.  Let's stay serious, please.

Maybe it means that we should allow unboxing with the semantics that 
it throw a NPE i.e the exact semantics of the assignment conversion 
instead of disallowing unboxing as i proposed, I don't know, but I 
really dislike the semantics you are proposing here.


It seems more likely that you just *don't understand* what is being 
proposed here (and the fact that you keep repeating the incorrect claim 
that pattern matching somehow means different things in different 
constructs underscores that.)


Let me try explaining the latter one more time, because you really do 
seem to have a deep misundestanding of the semantics being proposed, and 
until we can clear that up, I'm not sure there's much point in arguing 
about the finer points.



  if (o instanceof Circle(Point(int x, int y), int radius))
means


... doesn't match ...


while
  Circle(Point(int x, int y), int radius) = c;
means


  ... throws ...

In the context of switch or instanceof, a pattern does not have to be 
total, while in the case of an assignment the pattern has to be total,

so depending on the context, the semantics is different.


No, this is still incorrect.  The semantics of the *pattern match* does 
not change.  Ever.  Period.  The pattern Circle(Point(int x, int y), int 
radius) does not match `new Circle(null, 3)`, ever. Ever.  EVER.  Got 
it?  It DOES NOT MATCH.  Pattern matching is 1000% consistent on this 
point.  What you've observed is a statement about instanceof vs switch 
vs let, not about pattern matching at all.


What is different is that we have different constructs (instanceof, 
switch, let, try--catch, etc) which use pattern matching.  One 
(instanceof) evaluates to false when the operand is null (it doesn't 
even try testing the pattern), and doesn't require the pattern to be 
(statically) exhaustive, because it is a boolean expression.  Let, on 
the other hand, requires the pattern to be exhaustive, but tests the 
pattern anyway, and if it does not match, throws (just like switch 
throws on null when there is no null case.)


The difference here is entirely "when does the construct decide to try 
to match" (e.g., switch can throw NPE before trying to match any cases 
if it likes) and "what does the construct do when it runs out of 
patterns to try" (instanceof says false, switch and let throw, 
try...catch would eventually allow the exception to propagate).
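To make that concrete, here is a small sketch using the Circle example above; 
the record declarations are assumed, and the `let` form shown in the comment is 
the proposed construct, not something that compiles today:

    record Point(int x, int y) { }
    record Circle(Point center, int radius) { }

    Object c = new Circle(null, 3);

    // instanceof: the pattern does not match Circle(null, 3), so this is just false
    if (c instanceof Circle(Point(int x, int y), int radius)) {
        // not reached
    }

    // let (proposed): the same non-match, but the construct requires a match on
    // normal completion, so it would throw instead of silently continuing:
    //     let Circle(Point(int x, int y), int radius) = c;   // throws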





Re: [External] : Re: Primitive type patterns

2022-02-26 Thread Brian Goetz




 Relationship with assignment context


That's a huge leap, let's take a step back.

I see two questions that should be answered first.
1) do we really want pattern in case of assignment/declaration to 
support assignment conversions ?
2) do we want patterns used by the switch or instanceof to follow the 
exact same rules as patterns used in assignment/declaration ?


I agree we should take a step back, but let's take a step farther -- 
because I want to make an even bigger leap that you think :)


Stepping way far back  in the beginning ... Java had reference types 
with subtyping, and eight primitive types.  Which raises an immediate 
question: what types can be assigned to what?  Java chose a sensible 
guideline; assignment should be allowed if the value set on the left is 
"bigger" than that on the right.  This gives us String => Object, int => 
long, int => double, etc.  (At this point, note that we've gone beyond 
strict value set inclusion; an int is *not* a floating point number, but 
we chose (reasonably) to do the conversion because we can *embed* the 
ints in the value set of double.   Java was already appealing to the 
notion of embedding-projection pair even then, in assignment 
conversions; assignment from A to B is OK if we have an embedding of A 
into B.)


On the other hand, Java won't let you assign long => int, because it 
might be a lossy conversion.  To opt into the loss, you have to cast, 
which acknowledges that the conversion may be information-losing.  
Except!  If you can prove the conversion isn't information losing 
(because the thing on the right is a compile-time constant), then its 
OK, because we know its safe.  JLS Ch5 had its share of ad-hoc-seeming 
complexity, but mostly stayed in its corner until you called it, and the 
rules all seemed justifiable.


Then we added autoboxing.  And boxing is not problematic; int embeds 
into Integer.  So the conversion from int => Integer is fine. (It added 
more complexity to overload selection, brought in strict and loose 
conversion contexts, and we're still paying when methods like 
remove(int) merge with remove(T), but OK.)  But the other direction is 
problematic; there is one value of Integer that doesn't correspond to 
any value of int, which is our favorite value, null. The decision made 
at the time was to allow the conversion from Integer => int, and throw 
on null.


This was again a justifiable choice, and comes from the fact that the 
mapping from Integer to int is a _projection_, not an embedding.  It was 
decided (reasonably, but we could have gone the other way too) that null 
was a "silly" enough value to justify not requiring a cast, and throwing 
if the silly value comes up.  We could have required a cast from Integer 
to int, as we do from long to int, and I can imagine the discussion 
about why that was not chosen.
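A minimal illustration of those two directions, written as plain assignments 
that exist today (names are illustrative):

    long big = 1L << 40;
    // int i = big;           // does not compile: long -> int may lose information
    int i = (int) big;        // the cast opts in to the possibly lossy narrowing

    Integer boxed = null;
    int j = boxed;            // compiles (unboxing), but throws NullPointerException at runtime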


Having set the stage, one can see all the concepts in pattern matching 
dancing on it, just with different names.


Whether we can assign T to U with or without a cast, is something we 
needed a static rule for.  So we took the set of type pairs (T, U) for 
which the pattern `T t` is strictly total on U, and said "these are the 
conversions allowed in assignment context" (with a special rule for when 
the target is an integer constant.)


When we got to autoboxing, we made a subjective call that `int x` should 
be "total enough" on `Integer` that we're willing to throw in the one 
place it's not.  That's exactly the concept of "P is exhaustive, but not 
total, on T" (i.e., there is a non-empty remainder.)  All of this has 
happened before.  All of this will happen again.


So the bigger leap I've got in mind is: what would James et al have 
done, had they had pattern matching from day one?  I believe that:


 - T t = u would be allowed if `T t` is exhaustive on the static type of u;
 - If there is remainder, assignment can throw (preserving the 
invariant that if the assignment completes normally, something was 
assigned).


So it's not that I want to align assignment with pattern matching 
because we've got a syntactic construct on the whiteboard that operates 
by pattern matching but happens to looks like assignment; it's because 
assignment *is* a constrained case of pattern matching.  We've found the 
missing primitive, and I want to put it under the edifice.  If we define 
pattern matching correctly, we could rewrite JLS 5.2 entirely in terms 
of pattern matching (whether we want to actually rewrite it or not, 
that's a separate story.)


The great thing about pattern matching as a generalization of assignment 
is that it takes pressure off the one-size-fits-all ruleset.  You can write:


    int x = anInteger

but it might throw NPE.  In many cases, users are fine with that. But by 
interpreting it as a pattern, when we get into more flexible constructs, 
we don't *have* to throw eagerly.  If the user said:


    if (anInteger instanceof int x) { ... }

then we match the pattern 

Primitive type patterns

2022-02-25 Thread Brian Goetz
As a consequence of doing record patterns, we also grapple with 
primitive type patterns. Until now, we've only supported reference type 
patterns, which are simple:


 - A reference type pattern `T t` is applicable to a match target of 
type M if M can be cast to T without an unchecked warning.


 - A reference type pattern `T t` covers a match type M iff M <: T

 - A reference type pattern `T t` matches a value m of type M if M <: T 
|| m instanceof T


Two of these three characterizations are static computations 
(applicability and coverage); the third is a runtime test (matching).  
For each kind of pattern, we have to define all three of these.
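As a concrete reading of those three characterizations, using only types that 
exist today (the variable names are illustrative):

    Object o = "hello";
    // `CharSequence cs` is applicable to Object (Object can be cast to CharSequence
    // without an unchecked warning), does not cover Object (Object is not a subtype
    // of CharSequence), and matches here because o is an instance of CharSequence.
    if (o instanceof CharSequence cs) {
        System.out.println(cs.length());
    }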



 Primitive type patterns in records

Record patterns necessitate the ability to write type patterns for any 
type that can be a record component.  If we have:


    record IntBox(int i) { }

then we want to be able to write:

    case IntBox(int i):

which means we need to be able to express type patterns for primitive 
types.



 Relationship with assignment context

There is another constraint on primitive type patterns: the let/bind 
statement coming down the road.  Because a type pattern looks (not 
accidentally) like a local variable declaration, in a let/bind we will want 
to align the semantics of "local variable declaration with initializer" 
and "let/bind with total type pattern".  Concretely:


    let String s = "foo";

is a pattern match against the (total) pattern `String s`, which 
introduces `s` into the remainder of the block.  Since let/bind is a 
generalization of local variable declaration with initialization, 
let/bind should align with locals where the two can express the same 
thing.  This means that the set of conversions allowed in assignment 
context (JLS 5.2) should also be supported by type patterns.


Of the conversions supported by 5.2, the only one that applies when both 
the initializer and local variable are of reference type is "widening 
reference", which the above match semantics (`T t` matches `m` when `M 
<: T`) support.  So we need to fill in the other three boxes of the 2x2 
matrix of { ref, primitive } x { ref, primitive }.


The conversions allowed in assignment context are:

 - Widening primitive -- `long l = anInt`
 - Narrowing primitive -- `byte b = 0L` (only applies to constants on RHS)
 - Widening reference -- `Object o = aString`
 - Widening reference + unbox -- where `T` is a type variable with bound `Integer`, `int i = t`
 - Widening reference + unbox + widening primitive -- `long l = t`
 - Unboxing -- `int i = anInteger` (may NPE)
 - Unboxing + widening primitive -- `long l = anInteger` (may NPE)
 - Boxing -- `Integer i = anInt`
 - Boxing + widening reference -- `Object o = anInt`
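For reference, a few of these conversions written as plain assignments that 
compile today (variable names invented for illustration):

    int anInt = 1;
    Integer anInteger = 2;

    long l1 = anInt;          // widening primitive
    byte b1 = 100;            // narrowing primitive (int constant in byte range)
    Object o1 = "hello";      // widening reference
    int i1 = anInteger;       // unboxing (throws NPE if anInteger is null)
    long l2 = anInteger;      // unboxing + widening primitive
    Integer boxed = anInt;    // boxing
    Object o2 = anInt;        // boxing + widening reference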


 Boxing and unboxing

Suppose our match target is a box type, such as:

    record IntegerBox(Integer i) { }

Clearly we can match it with:

    case IntegerBox(Integer i):

If we want to align with assignment context, and support things like

    let int i = anInteger

(and if we didn't, this would likely be seen as a gratuitous gap between 
let/bind and local declaration), we need for `int i` to be applicable to 
`Integer`:


    case IntegerBox(int i):

There is one value of `Integer` that, when we try to unbox, causes 
trouble: null.  As of Java 5 when switching on wrapper types, we unbox 
eagerly, throwing NPE if the target is null. But pattern matching is 
conditional.  If we have:


    record Box<T>(T t) { }
    Box<Object> b;
    ...
    case Box(String s):

when we encounter values of Object that are not instances of String, we 
just don't match.  For unboxing, it should be the same; `int x` matches 
all non-null instances of `Integer`:


    case IntegerBox(int i):

Because `int i` matches all instances of Integer other than null, it is 
reasonable to say that `int i` _covers_ Integer, with remainder null, 
just like:


    Box<Box<String>> bbs;
    switch (bbs) {
    case Box(Box(String s)): ...
    }

covers the match target, with remainder Box(null).  When confronted with 
Box(null), the attempt to match the case doesn't throw, it just doesn't 
match; when we run out of cases, the switch can throw a last-ditch 
exception.  The same applies when unboxing would NPE.
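A small sketch of that behavior, assuming the proposed primitive type patterns 
inside record patterns (not current Java):

    record IntegerBox(Integer i) { }

    IntegerBox box = new IntegerBox(null);
    // `int i` does not match null, so no unboxing NPE is thrown here; the
    // instanceof simply evaluates to false for IntegerBox(null).
    if (box instanceof IntegerBox(int i)) {
        System.out.println(i);   // not reached for IntegerBox(null)
    }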


In the other direction, a primitive can always be boxed to its wrapper 
(or a supertype), meaning `Integer x` is applicable to, and covers, int, 
short, char, and byte.



 Primitive widening and narrowing

The pattern match equivalent of primitive widening is:

    let long l = anInt;

or

    case IntBox(long l):

(When we get to dtor patterns, we will have to deal with overload 
selection, but for record patterns, there is one canonical dtor.)  This 
seems uncontroversial, just as allowing `Object o` to match a `String` 
target.
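A tiny sketch of that widening case, again assuming the proposed record and 
primitive type patterns (IntBox as declared earlier in this note):

    record IntBox(int i) { }

    Object o = new IntBox(7);
    if (o instanceof IntBox(long l)) {
        // matches: the int component 7 is widened to long, as in `long l = anInt`
        System.out.println(l);   // 7
    }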


Primitive narrowing is less obvious, but there's a strong argument for 
generalizing primitive narrowing in pattern matching beyond constants.  
We already have to deal with


    let byte b = 0L;

via primitive narrowing, but pattern matching is a conditional 

Re: [External] : Re: Record patterns (and beyond): exceptions

2022-02-18 Thread Brian Goetz
We're lost in the weeds; I really can't follow what you're on about 
here, and more replies doesn't seem to be improving it. Since we're 
rapidly heading towards the danger zones I warned about in:


https://mail.openjdk.java.net/pipermail/amber-spec-observers/2020-August/002458.html

I think we should prune this sub-thread and give other folks a chance to 
reply to the main points.



On 2/18/2022 10:07 AM, fo...@univ-mlv.fr wrote:





*From: *"Brian Goetz" 
*To: *"Remi Forax" 
*Cc: *"amber-spec-experts" 
*Sent: *Friday, February 18, 2022 3:34:45 PM
*Subject: *Re: [External] : Re: Record patterns (and beyond):
exceptions


But this clearly does not fall into ICCE.  ICCE means,
basically, "your classpath is borked"; that things that
were known to be true at compile time are not true at
runtime.  (Inconsistent separate compilation is the most
common cause.)  But Box(Bag(null)) is not an artifact of
inconsistent separate compilation. 



I think I've not understood the problem correctly; I was
thinking the error was due to erasure, Box<T> being
erased to Box. The problem with erasure is that you see the
problem late, in the case of the switch after the phase that does
the instanceofs, so we end up with ClassCastException instead of ICCE.


CCE is not the right thing either.  Let's step back and go over
the concepts.

We want for the compiler to be able to do type checking that a
switch is "total enough" to not require a default clause. We want
this not just because writing a default clause when you think
you've got things covered is annoying, but also, because once you
have a default clause, you've given up on getting any better type
checking for totality.  In a switch over enum X {A, B}, having
only cases for A and B means that, when someone adds C later,
you'll find out about it, rather than sweeping it under the rug. 
Sealed class hierarchies have the same issues as enums; the
possibility of novel values due to separate compilation.  So far,
all of these could be described by ICCE (and they are, currently.)

We've already talked for several lifetimes about null; switches
that reject null do so with NPE.  That also makes sense.  We had
hoped that this covered the weird values that might leak out of
otherwise-exhaustive switches, but that was wishful thinking.

Having nested deconstruction patterns introduces an additional
layer of weirdness.  Suppose we have

    sealed interface A permits X, Y { }
    Box<A> box;

    switch (box) {
    case Box(X x):
    case Box(Y y):
    }

This should be exhaustive, but we have to deal with two additional
bad values: Box(null), which is neither a Box(X) nor a Box(Y), and
Box(C), for a novel subtype C.  We don't want to disturb the user
to deal with these by making them have a default clause.

So we define exhaustiveness separately from totality, and
remainder is the difference.  (We can constructively characterize
the upper bound on remainder.)  And we can introduce a throwing
default, as we did with expression switches over enums.  But what
should it throw?

The obvious but naive answer is "well, Box(null) should throw NPE,
and Box(C) should throw ICCE."  But only a few minutes thinking
shows this to be misleading, expensive, and arbitrary.  When we
encountered Box(null), it was not because anyone tried to
dereference a null, so throwing NPE is misleading. 



An NPE is not a problem if (the big if) the error message is "null 
matches neither Box(X) nor Box(Y)"


If the shape of the remainder is complicated, this means
generating tons of low-value, compiler-generated boilerplate to
differentiate Box(Bag(null)) from Box(Container()).  That's
expensive.  And, what about Foo(null, C)?  Then we have to
arbitrarily pick one. It's a silly scheme. 



We already talked about that, the shape of the remainder is complex if 
you want to generate all branches at compile time, it's not an issue 
if you generate the branches at runtime, because you can generate them 
lazily.
Some checks can only be done at runtime anyway, like whether this 
pattern is still total.


About Foo(null, C), I suppose you mean a case where you have both a 
null that needs to be deconstructed and a new subtype; the solution is 
to go left to right, as usual in Java.




So the logical thing to do is to say that these things fall into a
different category from NPE and ICCE, which is that they are
remainder, which gets its own label. 



Nope, as a user I want a real error message, not something saying 
"nope, sorry, too complex, I bail out".

Re: [External] : Re: Record patterns (and beyond): exceptions

2022-02-18 Thread Brian Goetz



But this clearly does not fall into ICCE.  ICCE means, basically,
"your classpath is borked"; that things that were known to be true
at compile time are not true at runtime. (Inconsistent separate
compilation is the most common cause.)  But Box(Bag(null)) is not
an artifact of inconsistent separate compilation. 



I think I've not understood the problem correctly; I was thinking the 
error was due to erasure, Box<T> being erased to Box. The 
problem with erasure is that you see the problem late, in the case of the 
switch after the phase that does the instanceofs, so we end up with 
ClassCastException instead of ICCE.


CCE is not the right thing either.  Let's step back and go over the 
concepts.


We want for the compiler to be able to do type checking that a switch is 
"total enough" to not require a default clause.  We want this not just 
because writing a default clause when you think you've got things 
covered is annoying, but also, because once you have a default clause, 
you've given up on getting any better type checking for totality.  In a 
switch over enum X {A, B}, having only cases for A and B means that, 
when someone adds C later, you'll find out about it, rather than 
sweeping it under the rug.  Sealed class hierarchies have the same 
issues as enums; the possibility of novel values due to separate 
compilation.  So far, all of these could be described by ICCE (and they 
are, currently.)


We've already talked for several lifetimes about null; switches that 
reject null do so with NPE.  That also makes sense.  We had hoped that 
this covered the weird values that might leak out of 
otherwise-exhaustive switches, but that was wishful thinking.


Having nested deconstruction patterns introduces an additional layer of 
weirdness.  Suppose we have


    sealed interface A permits X, Y { }
    Box<A> box;

    switch (box) {
    case Box(X x):
    case Box(Y y):
    }

This should be exhaustive, but we have to deal with two additional bad 
values: Box(null), which is neither a Box(X) nor a Box(Y), and Box(C), 
for a novel subtype C.  We don't want to disturb the user to deal with 
these by making them have a default clause.


So we define exhaustiveness separately from totality, and remainder is 
the difference.  (We can constructively characterize the upper bound on 
remainder.)  And we can introduce a throwing default, as we did with 
expression switches over enums.  But what should it throw?


The obvious but naive answer is "well, Box(null) should throw NPE, and 
Box(C) should throw ICCE."  But only a few minutes thinking shows this 
to be misleading, expensive, and arbitrary.  When we encountered 
Box(null), it was not because anyone tried to dereference a null, so 
throwing NPE is misleading.  If the shape of the remainder is 
complicated, this means generating tons of low-value, compiler-generated 
boilerplate to differentiate Box(Bag(null)) from 
Box(Container()).  That's expensive.  And, what about Foo(null, 
C)?  Then we have to arbitrarily pick one.  It's a silly scheme.


So the logical thing to do is to say that these things fall into a 
different category from NPE and ICCE, which is that they are remainder, 
which gets its own label.


In any case, I am not getting your point about "but people can
catch it."  So what?  People can catch OOME too, and try to parse
the output of toString() when we tell them not to. But that's no
reason to make all exceptions "OpaqueError". So what is your point
here? 



You can catch OOME if you write the code by hand. People are using 
IDEs, and when the IDE is lost or the user has clicked on the wrong 
button, catch(Exception) appears.
That's the reason why we have both IOError and UncheckedIOException in 
the JDK.


I'm still not getting your point.



Some patterns are considered exhaustive, but not total.  A
deconstruction pattern D(E(total)) is one such example; it is
exhaustive on D, but does not match D(null), because matching the
nested E(total) requires invoking a deconstructor in E, and you
can't invoke an instance member on a null receiver.  Still, we
consider D(E(total)) exhaustive on D, which means it is enough
to satisfy the type checker that you've covered everything.
Remainder is just the gap between exhaustiveness and totality. 



The gap is due to E(...) not matching null; for me that's an NPE with an 
error message saying exactly that.


See above -- this is (a) NOT about dereferencing a null; it's about a 
value outside the set of match values, (b) the scheme involving NPE does 
not scale, and (c) will eventually force us to silly arbitrary choices.


What you are saying is that at runtime you need to know if a pattern 
is total or not; more exactly, you need to know whether it was decided 
to be total at compile time, so at runtime you can decide whether or 
not to throw an NPE.
Furthermore, if at runtime you detect that the total pattern is not 
total anymore, an ICCE should be raised.



Re: [External] : Re: Record patterns (and beyond): exceptions

2022-02-17 Thread Brian Goetz




As we look ahead to record patterns, there is a new kind of
remainder: the "spine" of nested record patterns.  This includes
things like Box(null), Box(novel), Box(Bag(null)),
Box(Mapping(null, novel)), etc.  It should be clear that there is
no clean extrapolation from what we currently do, to what we
should do in these cases.  But that's OK; both of the existing
remainder-rejection cases derive from "what does the context
think" -- switch hates null (so, NPE), and enum switches are a
thing (which makes ICCE on an enum switch reasonable.) But in the
general case, we'll want some sort of MatchRemainderException. 



Nope, it can not be a runtime exception, because people will write code 
to catch it, and we will have a boatload of subtle bugs, because 
exceptions are side effects, so you can see in which order the 
deconstructors or the pattern methods are called. ICCE is fine.


But this clearly does not fall into ICCE.  ICCE means, basically, "your 
classpath is borked"; that things that were known to be true at compile 
time are not true at runtime.  (Inconsistent separate compilation is the 
most common cause.)  But Box(Bag(null)) is not an artifact of 
inconsistent separate compilation.


In any case, I am not getting your point about "but people can catch 
it."  So what?  People can catch OOME too, and try to parse the output 
of toString() when we tell them not to.  But that's no reason to make 
all exceptions "OpaqueError".  So what is your point here?





Note that throwing an exception from remainder is delayed until
the last possible moment.  We could have:

    case Box(Bag(var x)): ...
    case Box(var x) when x == null: ...

and the reasonable treatment is to treat Box(Bag(var x)) as not
matching Box(null), even though it is exhaustive on Box<Bag<...>>,
and therefore fall into the second case on Box(null). Only when we
reach the end of the switch, and we haven't matched any cases, do
we throw MatchRemainderException. 



I really dislike that idea, it will be a burden in the future each 
time we want to change the implementation.
I would like the semantics to make no promise about when the error 
will be thrown; the semantics should not be defined if a 
deconstructor/pattern method does a side effect, the same way the 
stream semantics is not defined if the lambda taken as a parameter of 
Stream.map() does a side effect.
I think the parallel between pattern matching and a stream in terms 
of execution semantics is important here. From the outside, those 
things are monads, they should work the same way.





I think this stems from the same misunderstanding you have about the 
boundary between the pattern semantics and the construct semantics. I'm 
going to test-drive some adjusted language here.


A total pattern is just that -- it matches everything.

Some patterns are considered exhaustive, but not total.  A 
deconstruction pattern D(E(total)) is one such example; it is exhaustive 
on D, but does not match D(null), because matching the nested E(total) 
requires invoking a deconstructor in E, and you can't invoke an instance 
member on a null receiver.  Still, we consider D(E(total)) exhaustive on 
D, which means it is enough to satisfy the type checker that you've 
covered everything. Remainder is just the gap between exhaustiveness and 
totality.


If we have the following switch:

    case D(E(Object o)):
    case D(var x) when x == null:

the semantics of D(E(Object)) are that *it matches all non-null D, 
except D(null)*.  So throwing when we evaluate the case would be 
incorrect; switch asks the pattern "do you match", and the pattern says 
"no, I do not."  And the semantics of switch, then, say "then I will 
keep trying the rest of the cases."


So *when* the error is thrown derives from the semantics of the 
construct; switch tries matching with each pattern, until it finds a 
match or runs out of patterns.  When it runs out of patterns is when it 
needs to insert a catch-all to deal with remainder (as we do with enum 
switch expressions.)
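A sketch of that construct-level behavior for the switch above; the record 
declarations and the helper methods use/useEmpty are placeholders, and the 
exception name is the one under discussion, not an existing class:

    switch (box) {
        case Box(Bag(var x)) -> use(x);                   // does not match Box(null)
        case Box(var x) when x == null -> useEmpty();     // Box(null) lands here
        // conceptually, the compiler appends a throwing catch-all for any
        // remainder not covered by the cases above:
        default -> throw new MatchRemainderException();   // name still a bikeshed
    }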

Record patterns (and beyond): exceptions

2022-02-16 Thread Brian Goetz
As we move towards the next deliverable -- record patterns -- we have 
two new questions regarding exceptions to answer.


 Questions

1.  When a dtor throws an exception.  (You might think we can kick this 
down the road, since records automatically acquire a synthetic dtor, and 
users can't write dtors yet, but since that synthetic dtor will invoke 
record component accessors, and users can override record component 
accessors and therefore they can throw, we actually need to deal with 
this now.)


This has two sub-questions:

1a.  Do we want to do any extra type checking on the bodies of dtors / 
record accessors to discourage explicitly throwing exceptions?  
Obviously we cannot prevent exceptions like NPEs arising out of 
dereference, but we could warn on an explicit throw statement in a 
record accessor / dtor declaration, to remind users that throwing from 
dtors is not the droid they are looking for.


1b.  When the dtor for Foo in the switch statement below throws E:

    switch (x) {
    case Box(Foo(var a)): ...
    case Box(Bar(var b)): ...
    }

what should happen?  Candidates include:

 - allow the switch to complete abruptly by throwing E?
 - same, but wrap E in some sort of ExceptionInMatcherException?
 - ignore the exception and treat the match as having failed, and move 
on to the next case?


2.  Exceptions for remainder.  We've established that there is a 
difference between an _exhaustive_ set of patterns (one good enough to 
satisfy the compiler that the switch is complete enough) and a _total_ 
set of patterns (one that actually covers all input values.)  The 
difference is called the _remainder_.  For constructs that require 
totality, such as pattern switches and let/bind, we have invariants 
about what will have happened if the construct completes normally; for 
switches, this means exactly one of the cases was selected.  The 
compiler must make up the difference by inserting a throwing catch-all, 
as we already do with expression switches over enums, and all switches 
over sealed types, that lack total/default clauses.


So far, remainder is restricted to two kinds of values: null (about 
which switch already has a strong opinion) and "novel" enum constants / 
novel subtypes of sealed types. For the former, we throw NPE; for the 
latter, we throw ICCE.
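For the enum flavor, a small example of what already happens today with 
exhaustive switch expressions (Color and color are illustrative):

    enum Color { RED, BLUE }

    int x = switch (color) {       // exhaustive: no default needed
        case RED -> 0;
        case BLUE -> 1;
        // if Color later gains a constant and this code is not recompiled,
        // the compiler-inserted catch-all throws (currently ICCE, as noted above)
    };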


As we look ahead to record patterns, there is a new kind of remainder: 
the "spine" of nested record patterns.  This includes things like 
Box(null), Box(novel), Box(Bag(null)), Box(Mapping(null, novel)), etc.  
It should be clear that there is no clean extrapolation from what we 
currently do, to what we should do in these cases.  But that's OK; both 
of the existing remainder-rejection cases derive from "what does the 
context think" -- switch hates null (so, NPE), and enum switches are a 
thing (which makes ICCE on an enum switch reasonable.)  But in the 
general case, we'll want some sort of MatchRemainderException.


Note that throwing an exception from remainder is delayed until the last 
possible moment.  We could have:


    case Box(Bag(var x)): ...
    case Box(var x) when x == null: ...

and the reasonable treatment is to treat Box(Bag(var x)) as not matching 
Box(null), even though it is exhaustive on Box<Bag<...>>, and therefore 
fall into the second case on Box(null).  Only when we reach the end of 
the switch, and we haven't matched any cases, do we throw 
MatchRemainderException.


 Discussion

For (1a), my inclination is to do nothing for record accessors, but when 
we get to explicit dtors, warning on explicit throw is not a bad idea.  
Unlike ctors, where exceptions are part of the standard toolbox, 
deconstructors are handed an already-constructed object, and are 
supposed to be total.  If you're inclined to write a partial dtor, 
you're probably doing it wrong.  As it is a new construct, the 
additional error checking to guide people to its proper use is probably 
reasonable (and cheap.)


For (1b), since an exception in a dtor is supposed to indicate an 
exceptional failure, I don't think swallowing it and trying to go on 
with the show is a good move.  My preference would be to wrap the 
exception, as we do with ExceptionInInitializerError, to make it clear 
that an exception from a dtor is a truly unexpected thing, and clearly 
name-and-shame the offending dtor.  So there's a bikeshed to paint for 
what we call this exception.


For (2), trying to repurpose either NPE or ICCE here is a losing move.  
Better to invent an exception type that means "uncovered remainder" 
(which is more akin to an IAE than anything else; someone passed a bad 
value to an exhaustive-enough switch.)  We would use the same exception 
in let/bind, so this shouldn't have "switch" in its name, but probably 
something more like MatchRemainderException.




Re: [External] : Re: Reviewing feedback on patterns in switch

2022-02-16 Thread Brian Goetz
Of course, in an ecosystem as diverse as Java developers, one routinely 
expects to get complaints about both X and ~X.  Which makes it notable 
that we have not gotten any complaints about "why do you force me to 
write an empty default".  (I'm not complaining!)


The case you raise -- legacy { switch type, labels, statement } switches 
-- is harder to fix.  The things we've explored (like an opt-in to 
totality) are pretty poor fixes, since (a) they are noisy warts, and (b) 
people will forget them and still have the problem.  So these are 
harder, longer-term problems.  (For now, the best we can do is noisy 
warnings.)
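In the meantime, one way to get definite assignment out of a legacy statement 
switch (referring to the enum example quoted below, with its color and x 
variables) is to add a throwing default by hand:

    switch (color) {
        case RED -> x = 0;
        case BLUE -> x = 1;
        default -> throw new AssertionError("unexpected: " + color);  // keeps x definitely assigned
    }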


On 2/16/2022 11:00 AM, Remi Forax wrote:





    *From: *"Brian Goetz" 
*To: *"amber-spec-experts" 
*Sent: *Wednesday, February 16, 2022 4:49:19 PM
*Subject: *Re: Reviewing feedback on patterns in switch

One thing that we have, perhaps surprisingly, *not* gotten
feedback on is forcing all non-legacy switches (legacy type,
legacy labels, statement only) to be exhaustive.  I would have
thought people would complain about pattern switches needing to be
exhaustive, but no one has! So either no one has tried it, or we
got away with it...


Yes, we had several pieces of feedback about the opposite: why the switch 
statement on an enum is not exhaustive, i.e. why the following code 
does not compile


enum Color { RED, BLUE }
int x;
Color color = null;
switch (color) {
   case RED -> x = 0;
   case BLUE -> x = 1;
}
System.out.println(x);  // x may not be initialized
Rémi



    On 1/25/2022 2:46 PM, Brian Goetz wrote:

We’ve previewed patterns in switch for two rounds, and have received 
some feedback.  Overall, things work quite well, but there were a few items 
which received some nontrivial feedback, and I’m prepared to suggest some 
changes based on them.  I’ll summarize them here and create a new thread for 
each with a more detailed description.

I’ll make a call for additional items a little later; for now, let’s 
focus on these items before adding new things (or reopening old ones.)

1.  Treatment of total patterns in switch / instanceof

2.  Positioning of guards

3.  Type refinements for GADTs

4.  Diamond for type patterns (and record patterns)






Re: Reviewing feedback on patterns in switch

2022-02-16 Thread Brian Goetz
One thing that we have, perhaps surprisingly, *not* gotten feedback on 
is forcing all non-legacy switches (legacy type, legacy labels, 
statement only) to be exhaustive.  I would have thought people would 
complain about pattern switches needing to be exhaustive, but no one 
has! So either no one has tried it, or we got away with it...


On 1/25/2022 2:46 PM, Brian Goetz wrote:

We’ve previewed patterns in switch for two rounds, and have received some 
feedback.  Overall, things work quite well, but there were a few items which 
received some nontrivial feedback, and I’m prepared to suggest some changes 
based on them.  I’ll summarize them here and create a new thread for each with 
a more detailed description.

I’ll make a call for additional items a little later; for now, let’s focus on 
these items before adding new things (or reopening old ones.)

1.  Treatment of total patterns in switch / instanceof

2.  Positioning of guards

3.  Type refinements for GADTs

4.  Diamond for type patterns (and record patterns)




Re: Reviewing feedback on patterns in switch

2022-02-16 Thread Brian Goetz


For me, && is more natural than "when" because I've written more 
switches that use && than "when".
And don't forget that unlike most of the code, with pattern matching 
the number of characters does matter, this is more similar to 
lambdas, if what you write is too verbose, you will not write it.


At the risk of premature bikeshedding, have we already discussed and 
discarded the idea of spelling “when” as “if”? It’s been a long time, 
and I forget.


There was not extensive discussion on this, and its all very 
subjective/handwavy/"what we think people would think", but I remember a 
few comments on this:


 - The generality of "if" reminded people of the Perl-style "statement 
unless condition" postfix convention, and that people might see it as an 
"inconsistency" that they could not then say


   x = 3 if (condition);

which is definitely somewhere we don't want to go.


 - We're use to seeing "if" with a consequence, and a "naked" if might 
have the effect of "lookahead pollution" in our mental parsers.



 - Keeping `if` for statements allows us to keep the "body" of case 
clauses visually distinct from the "envelope":


    case Foo(var x)
    if (x > 3) : if (x > 10) { ... }

would make people's eyes go buggy.  One could argue that "when" is not 
fantastically better:


    case Foo(var x)
    when (x > 3) : if (x > 10) { ... }

but it doesn't take quite as long to de-bug oneself in that case.

On 2/15/2022 9:55 PM, Guy Steele wrote:


Re: [External] : Re: Reviewing feedback on patterns in switch

2022-02-16 Thread Brian Goetz
OK, I'll make you a deal: I'll answer your question about let/bind, 
under the condition that we not divert the discussion on that right now 
-- there'll be a proper writeup soon.  The answer here is entirely for 
context.


If you don't agree, stop reading now :)

On 2/15/2022 5:58 PM, Remi Forax wrote:


 - There are future constructs that may take patterns, and may (or
may not) want to express guard-like behavior, such as `let`
statements (e.g., let .. when .. else.)  Expressing guards here
with && is even less evocative of "guard condition" than it is
with switches.


It's not clear to me how to use "let when else". Is it more like a ?: 
in C than the "let ... in" of Caml?


The simplest form of `let` is a statement that takes a total pattern:

    let Point(var x, var y) = aPoint;

and introduces bindings x and y into the remainder of the block. When 
applicable, this is better than a conditional context because (a) you 
get type checking for totality, and (b) you don't indent the rest of 
your method inside a test that you know will always succeed.


If the pattern is total but has some remainder, the construct must throw 
on the remainder, to preserve the invariant that when a `let` statement 
completes normally, all bindings are DA.
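For contrast, roughly the hand-written equivalent today, assuming a record 
Point(int x, int y) and record patterns in instanceof:

    if (aPoint instanceof Point(var x, var y)) {
        // the rest of the method lives here, indented inside a test that
        // (for a total pattern) can only fail on remainder such as null
    } else {
        throw new IllegalStateException("remainder: " + aPoint);  // hand-written remainder handling
    }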


What if I want to use a partial pattern, and then customize either the 
throwing part or provide default values?   I can provide an else clause:


    Object o = ...
    let String s = o
    else throw new NotStringException();

or

    Object o = ...
    let String s = o
    else { s = "no string"; }

These are two ways to preserve the "DA on normal completion" invariant; 
either by not completing normally, or by ensuring the bindings are DA.


Now, we are in a situation where we are with switch: patterns do not 
express all possible conditions.  Which is why we introduced guards to 
switches.  And we can use the same trick here:


    Object o = ...
    let String s = o
    when (!s.isEmpty())
    else { s = "no string"; }

If we tried to use && here, it would look like

    Object o = ...
    let String s = o && (!s.isEmpty())
    else { s = "no string"; }

which has the same problem as `case false && false`.

Reminder: THIS EXPLANATION WAS PROVIDED SOLELY TO CLARIFY THE "FUTURE 
CONSTRUCT" COMMENT IN THE && DISCUSSION.




Re: [External] : Re: Reviewing feedback on patterns in switch

2022-02-16 Thread Brian Goetz



Not sure it's a no-brainer.
The question is more a question of consistency. There are two 
consistencies and we have to choose one: either switch never allows 
null by default and users have to opt in with case null, or we want 
patterns to behave the same way whether they are declared at top level 
or nested. I would say that the semantics you propose is more 
like the current Java and the other semantics is more like the Java of 
a future (if we choose the second option).


You are right that any justification involving "for consistency" is 
mostly a self-justification.  But here's where I think this is a cleaner 
decomposition.


We define the semantics of the patterns in a vacuum; matching is a 
three-place predicate involving a static target type, a target 
expression, and a pattern.  Null is not special here.  (This is how 
we've done this all along.)


Pattern contexts (instanceof, switch, and  in the future, nested 
patterns, let/bind, catch, etc) on the other hand, may have pre-existing 
(and in some cases reasonable) opinions about null. What's new here is 
to fully separate the construct opinions about special values from the 
pattern semantics -- the construct makes its decision about the special 
values, before consulting the pattern.


This lets instanceof treat null as valid but say "null is not an 
instance of anything", past-switch treats null as always an error, and 
future-switch treats null as a value you can opt into matching with the 
`null` label.  (Yes, this is clunky; if we had non-nullable type 
patterns, we'd get there more directly.)


But the part that I think is more or less obvious-in-hindsight is that 
the switch opinions are switches opinions, and the pattern opinions are 
pattern opinions, and there is a well-defined order in which those 
opinions are acted on -- the construct mediates between the target and 
the patterns.  That is, we compose the result from the construct 
semantics and-then the pattern semantics.


None of this is really all that much about "how do people like it". But 
what I do think people will like is that they get a simple rule out of 
switches: "switches throw on null unless the letters n-u-l-l appear in 
the switch body".  And a simple rule for instanceof: "instanceof never 
evaluates to true on null".  And that these rules are *independent of 
patterns*.  So switch and instanceof can be understood separately from 
patterns.
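Those two rules, side by side, in a small sketch using the pattern-switch 
semantics previewed here:

    Object o = null;

    // instanceof never evaluates to true on null:
    boolean b = o instanceof String s;          // false, no exception

    // switch throws on null unless the letters n-u-l-l appear in the body:
    switch (o) {
        case null     -> System.out.println("null");    // opting in to null
        case String s -> System.out.println("string: " + s);
        default       -> System.out.println("something else");
    }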




Re: Reviewing feedback on patterns in switch

2022-02-15 Thread Brian Goetz
We're preparing a third preview of type patterns in switch.  Normally we 
would release after a second preview, but (a) we're about to get record 
patterns, which may disclose additional issues with switch, so best to 
keep it open for at least another round, and (b) we're proposing some 
nontrivial changes which deserve another preview.


Here's where we are on these.


1.  Treatment of total patterns in switch / instanceof


Quite honestly, in hindsight, I don't know why we didn't see this 
sooner; the incremental evolution proposed here is more principled than 
where we were in the previous round; now the construct (instanceof, 
switch, etc) *always* gets first crack at enforcing its nullity (and 
exception) opinions, and *then* delegates to the matching semantics of 
the pattern if it decides to do so.  This fully separates pattern 
semantics from conditional construct semantics, rather than complecting 
them (which in turn deprived users of seeing the model more clearly.)  
In hindsight, this is a no-brainer (which is why we preview things.)  
We'll be addressing this in the 3rd preview.



2.  Positioning of guards


Making guards part of switch also feels like a better factoring than 
making them part of patterns; it simplifies patterns and totality, and 
puts switch on a more equal footing with our other conditional 
constructs.  We did go back and forth a few times on this, but having 
given this a few weeks to settle, I'm pretty convinced we'd regret going 
the other way.


There were two sub-points here: (a) is the guard part of the pattern or 
part of switch, and (b) the syntax.  There was general agreement on (a), 
but some had preference for && on (b).  I spent some more time thinking 
about this choice, and have come down firmly on the `when` side of the 
house as a result for a number of reasons.


 - Possibility for ambiguity.  If switching over booleans (which we 
will surely eventually be forced into), locutions like `case false && 
false` will be very confusing; it's pure puzzler territory.
 - && has a stronger precedence than keyword-based operators like 
`instanceof`; we want guards to be weakest here.
 - Using && will confuse users about whether it is part of the 
expression, or part of the switch statement.  If we're deciding it is 
part of the switch, this should be clear, and a `when` clause makes that 
clear.
 - There are future constructs that may take patterns, and may (or may 
not) want to express guard-like behavior, such as `let` statements 
(e.g., let .. when .. else.)  Expressing guards here with && is even 
less evocative of "guard condition" than it is with switches.

 - Users coming from other languages will find `case...when` quite clear.
 - We've talked about "targetless" switches as a possible future 
feature, which express multi-way conditionals:


    switch {
    case when (today() == TUESDAY): ...
    case when (location() == GREENLAND): ...
    ...
    }

This would look quite silly with &&.  Similarly, one could mix guards 
with a targeted switch:


    switch (x) {
    case Time t: ...
    case Place p: ...
    default when (today() == TUESDAY): ... tuesday-specific default
    default: ... regular default ...

Expressing guards that are the whole condition with `when` is much more 
natural than with &&.


tl;dr: inventing a `when` modifier on switch now will save us from 
having to invent something else in the future; choosing && will not.


We can continue to discuss the bikeshed at low volume (at least until we 
start repeating ourselves), but we need to address both of these points 
in the 3rd preview.



3.  Type refinements for GADTs


I've been working through the details here, and there are a number of 
additional touch points where GADTs can provide type refinement, not 
just on the RHS of a case, such as totality and inference.  I'll be 
pulling all these together to try to get a total picture here. It's not 
a blocker for the 3rd preview, it can be a future refinement.



4.  Diamond for type patterns (and record patterns)
This seems desirable, but there are details to work out.  It's not a 
blocker for the 3rd preview, it can be a future refinement.
