Re: Array patterns (and varargs patterns)

Brian Goetz Fri, 09 Sep 2022 11:29:55 -0700

Again, look for the embedding-projection pairs. The sets involved are T^n and T[]. The array creation operator is an embedding from T^n to T[]; the missing dual is the projection from T[] to T^k (for specific k). Projections are partial (or lossy), so these are patterns rather than total functions. The dual of packing an array from a list of expressions is unpacking the elements into a list of variables.


When I pack an array:


    String[] ss = new String[] { "Hi", "Bob" };

this has a similar feel to

    Object o = "Bob";

in that we've thrown away some static typing information (in the former, that the array has length two). But this information is retained dynamically, and we can recover it with a runtime test. Asking


    if (o instanceof String s) { ... }

is asking "was the last assignment to `o` from a String".  Asking

    if (ss instanceof String[] { var a, var b }) { ... }

is asking "was the last assignment to ss a String[] with two elements" (and similar for other configurations of the nested patterns). In both cases, we are asking the same generalized question: could this { object, array } have come from an assignment / creation expression that has a certain shape.
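
In today's Java, the question an array pattern would ask can be spelled out by hand. What follows is only a sketch of that runtime test, not the proposed feature; the names `ShapeTest` and `describe` are made up for illustration, and it assumes Java 16 or later for `instanceof` type patterns.

```java
class ShapeTest {
    // Hand-written equivalent of `o instanceof String[] { var a, var b }`:
    // recover the dynamic type and length that the static type threw away.
    static String describe(Object o) {
        if (o instanceof String[] arr && arr.length == 2) {
            String a = arr[0];
            String b = arr[1];
            return "pair: " + a + ", " + b;
        }
        if (o instanceof String s) {
            return "string: " + s;
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new String[] { "Hi", "Bob" })); // pair: Hi, Bob
        System.out.println(describe("Bob"));                        // string: Bob
        System.out.println(describe(new String[] { "Hi" }));        // no match
    }
}
```

The length check and the two element reads are exactly the boilerplate the array pattern would fold into one nested form.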

I get it; you don't find this feature compelling. You've said that already, and now we're just going in circles. Your mail reads to me like "it's a bad idea because I think it's a bad idea." Yes, other languages approach this in different ways; Caml deconstructs into (head, tail) because its fundamental data structure is a cons list. That makes sense given how the language works. Java works differently, so transplanting from Caml or JavaScript is not always going to be a good answer.

Remember the pattern mantra: each aggregation idiom in the language should have a corresponding deconstruction pattern. Constructors have deconstruction patterns; factory methods will eventually have named static patterns; if we add collection literals, there will be collection patterns, etc. If an aggregation form lacks a corresponding dual, this turns into an asymmetry which in turn means *destructuring cannot compose the same way aggregation composes*. This is bad! Arrays have their own special form of aggregation (array creation expression); array patterns are the corresponding destructuring.

I encourage you to re-read https://openjdk.org/projects/amber/design-notes/patterns/pattern-match-object-model, and the "red ball" API examples, to see what I mean. This is about composability, not about whether any given form of pattern "pays its weight."

So again, please try harder to engage with _why do we think this is important_, and the specifics of what has been proposed, rather than just waving the YAGNI stick. There's a bigger picture here.

For me, Arrays.of() is a named pattern with a varargs list of bindings, no?

It's a named pattern, but to work, it would need varargs patterns -- and
array patterns are the underpinnings of varargs, just as array creation
is the underpinning of varargs invocation.  We're not going to do
varargs patterns differently than we do varargs invocation, just to
avoid doing array patterns -- that would be silly.

Here we want to extract the values into bindings/variables, which is not what
varargs does: varargs takes a bunch of values on the stack and puts them into
an array.
Here we want the opposite operation, the spread (or splat) operator, which
takes the elements of an array (or a collection?) and puts them on the stack.

If we have the pattern method Arrays.of()

static <T> pattern (T...) of(T[] array) {  // here it's a varargs
   ...
}

and we call it using a named pattern
   switch(array) {
     case Arrays.of(/* insert a syntax here */) -> ...

the syntax should extract some/all values of the array into one or several 
bindings.

If we are in Caml, we have the :: operator to separate the first element from 
the rest
   switch(array) {
     case Arrays.of(String first :: String[] rest) -> ...

If we are in JavaScript, we have the spread operator (notice that the ... is 
before the type)
   switch(array) {
     case Arrays.of(String first, ... String[] rest) -> ...

So varargs belongs on the declaration side; on the pattern side we need a new
spread operator, so I think that adding an array pattern now is not a good idea.

regards,
Rémi

With best regards,
Tagir Valeev.

On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz wrote:

We dropped this out of the record patterns JEP, but I think it is time to
revisit this.

The concept of array patterns was pretty straightforward; they mimic the nesting
and exhaustiveness rules of record patterns, they are just a different sort of
container for nested patterns.  And they have an obvious duality with array
creation expressions.

The main open question here was how we distinguish between "match an array of
length exactly N" (where there are N nested patterns) and "match an array of
length at least N".  We toyed with the idea of a "..." indicator to mean "more
elements", but this felt a little forced and opened new questions.

It later occurred to me that there is another place to nest a pattern in an
array pattern -- to match (and bind) the length.  In the following, assume for
sake of exposition that "_" is the "any" pattern (matches everything, binds
nothing) and that we have some way to denote a constant pattern, which I'll
denote here with a constant literal.

There is an obvious place to put this (optional) pattern: in between the
brackets.  So:

       case String[1] { P }:
                   ^ a constant pattern

would match string arrays of length 1 whose sole element matches P.  And

       case String[] { P, Q }

would match string arrays of length exactly 2, whose first two elements match P
and Q respectively.  (If the length pattern is not specified, we infer a
constant pattern whose constant is equal to the length of the nested pattern
list.)

Matching a target to `String[L] { P0, .., Pn }` means

       x instanceof String[] arr
           && arr.length matches L
           && arr.length > n
           && arr[0] matches P0
           && arr[1] matches P1
           ...
           && arr[n] matches Pn
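
As a rough illustration of this translation in today's Java, the sketch below models the length pattern and the nested element patterns as plain predicates. This is an assumption made for exposition only: real patterns would also produce bindings, and `ArrayPatternSketch` and `matches` are hypothetical names, not part of any proposal.

```java
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.Predicate;

class ArrayPatternSketch {
    // Emulates matching `String[L] { P0, .., Pn }` against a target.
    // `length` stands in for the length pattern L, `elements` for P0..Pn.
    // Both are predicates here, so nothing is bound.
    static boolean matches(Object target, IntPredicate length,
                           List<Predicate<String>> elements) {
        if (!(target instanceof String[] arr)) return false;
        if (!length.test(arr.length)) return false;
        // There must be an element for every nested pattern.
        if (arr.length < elements.size()) return false;
        for (int i = 0; i < elements.size(); i++) {
            if (!elements.get(i).test(arr[i])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Predicate<String> any = s -> true;
        Object target = new String[] { "a", "b", "c" };
        // String[_] { P, Q }: any length, first two elements must match
        System.out.println(matches(target, n -> true, List.of(any, any)));   // true
        // String[] { P, Q }: omitted length pattern defaults to the constant 2
        System.out.println(matches(target, n -> n == 2, List.of(any, any))); // false
        // String[3] { }: exactly three elements, no element patterns
        System.out.println(matches(target, n -> n == 3, List.of()));         // true
    }
}
```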

More examples:

       case String[int len] { P }

would match string arrays of length >= 1 whose first element matches P, and
further binds the array length to `len`.

       case String[_] { P, Q }

would match string arrays of any length whose first two elements match P and Q.

       case String[3] { }
                   ^constant pattern

matches all string arrays of length 3.


This is a more principled way to do it, because the length is a part of the
array and deserves a chance to match via nested patterns, just as with the
elements, and it avoids trying to give "..." a new meaning.

The downside is that it might be confusing at first (though people will learn
quickly enough) how to distinguish between an exact match and a prefix match.




On 1/5/2021 1:48 PM, Brian Goetz wrote:

As we get into the next round of pattern matching, I'd like to opportunistically
attach another sub-feature: array patterns.  (This also bears on the question
of "how would varargs patterns work", which I'll address below, though they
might come later.)

## Array Patterns

If we want to create a new array, we do so with an array construction
expression:

       new String[] { "a", "b" }

Since each form of aggregation should have its dual in destructuring, the
natural way to represent an array pattern (h/t to AlanM for suggesting this)
is:

       if (arr instanceof String[] { var a, var b }) { ... }

Here, the applicability test is: "are you an instance of String[], with length
== 2", and if so, we cast to String[], extract the two elements, and match them
to the nested patterns `var a` and `var b`.   This is the natural analogue of
deconstruction patterns for arrays, complete with nesting.

Since an array can have more elements, we likely need a way to say "length >= 2"
rather than simply "length == 2".  There are multiple syntactic ways to get
there; for now I'm going to write

       if (arr instanceof String[] { var a, var b, ... })

to indicate "more".  The "..." matches zero or more elements and binds nothing.
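
For comparison, here is roughly what the prefix form stands for in today's Java. It is a sketch only: `PrefixMatch` and `firstTwo` are illustrative names, and an empty `Optional` plays the role of "no match".

```java
import java.util.Optional;

class PrefixMatch {
    // Hand-written equivalent of `arr instanceof String[] { var a, var b, ... }`:
    // the trailing "..." relaxes the length test from == 2 to >= 2.
    static Optional<String> firstTwo(Object o) {
        if (o instanceof String[] arr && arr.length >= 2) {
            return Optional.of(arr[0] + "," + arr[1]);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstTwo(new String[] { "x", "y", "z" })); // Optional[x,y]
        System.out.println(firstTwo(new String[] { "x" }));           // Optional.empty
    }
}
```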

<digression>
People are immediately going to ask "can I bind something to the remainder"; I
think this is mostly an "attractive distraction", and would prefer to not have
this dominate the discussion.
</digression>

Here's an example from the JDK that could use this effectively:

String[] limits = limitString.split(":");
try {
       switch (limits.length) {
           case 2: {
               if (!limits[1].equals("*"))
                   setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
           }
           case 1: {
               if (!limits[0].equals("*"))
                   setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
           }
       }
}
catch (NumberFormatException ex) {
       setMultilineLimit(MultilineLimit.DEPTH, -1);
       setMultilineLimit(MultilineLimit.LENGTH, -1);
}

becomes (eventually)

       switch (limitString.split(":")) {
           case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
           case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
           default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
       }

Note how not only does this become more compact, but the unchecked
"NumberFormatException" is folded into the match, rather than being a separate
concern.
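
To make the "folded into the match" point concrete, here is one way the same control flow could be approximated today. This is a sketch under stated assumptions: `tryParseInt` is a hypothetical stand-in for the partial pattern `Integer.parseInt(var i)`, and `classify` returns a description instead of calling `setMultilineLimit`, which is not reproduced here.

```java
import java.util.OptionalInt;

class LimitParser {
    // Hypothetical stand-in for the partial pattern `Integer.parseInt(var i)`:
    // an empty result means "no match" instead of an exception.
    static OptionalInt tryParseInt(String s) {
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Mirrors the three cases of the pattern switch above.
    static String classify(String limitString) {
        String[] limits = limitString.split(":");
        if (limits.length == 2) {
            OptionalInt depth = tryParseInt(limits[1]);
            if (depth.isPresent()) return "DEPTH=" + depth.getAsInt();
        }
        if (limits.length == 1) {
            OptionalInt length = tryParseInt(limits[0]);
            if (length.isPresent()) return "LENGTH=" + length.getAsInt();
        }
        // Neither case matched, including the case where parsing failed.
        return "DEPTH=-1,LENGTH=-1";
    }

    public static void main(String[] args) {
        System.out.println(classify("10:20")); // DEPTH=20
        System.out.println(classify("10"));    // LENGTH=10
        System.out.println(classify("10:x"));  // DEPTH=-1,LENGTH=-1
    }
}
```

A failed parse simply falls through to the next case, which is the behavior the nested pattern would give without any explicit try/catch at the use site.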


## Varargs patterns

Having array patterns offers us a natural way to interpret deconstruction
patterns for varargs records.  Assume we have:

       void m(X... xs) { }

Then a varargs invocation

       m(a, b, c)

is really sugar for

       m(new X[] { a, b, c })

So the dual of a varargs invocation, a varargs match, is really a match to an
array pattern.  So for a record

       record R(X... xs) { }

a varargs match:

       case R(var a, var b, var c):

is really sugar for an array match:

       case R(X[] { var a, var b, var c }):

And similarly, we can use our "more arity" indicator:

       case R(var a, var b, var c, ...):

to indicate that there are at least three elements.
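
The aggregation half of this duality already exists and can be checked today; the matching half has to be written by hand. The sketch below assumes Java 16 or later for records, and `VarargsDual` and `hasExactlyThree` are illustrative names only.

```java
import java.util.Arrays;

class VarargsDual {
    record R(String... xs) { }

    // Hand-written equivalent of `case R(var a, var b, var c)`:
    // match the record, then test its array component by length.
    static boolean hasExactlyThree(Object o) {
        return o instanceof R r && r.xs().length == 3;
    }

    public static void main(String[] args) {
        R sugared = new R("a", "b", "c");                     // varargs invocation
        R explicit = new R(new String[] { "a", "b", "c" });   // what it is sugar for
        System.out.println(Arrays.equals(sugared.xs(), explicit.xs())); // true
        System.out.println(hasExactlyThree(sugared));                   // true
        System.out.println(hasExactlyThree(new R("a")));                // false
    }
}
```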
