Unnamed variables and match-all patterns

Brian Goetz Wed, 07 Sep 2022 10:41:51 -0700

We've gone around and around a few times on "unnamed variables" (underscore), starting with JEP 302 (Lambda Leftovers). We reclaimed the underscore token in Java 9 with the intention of using it for unnamed variables and "any" patterns. Along the way, we ran into some hiccups, and it has sat on the shelf for a while. Let's take it down, dust it off, and see if we have any more clarity than before.

There are three syntactic productions in which we might want to use underscore as a "don't care" indicator:

- Unnamed variables. Here, underscore stands in for a variable name. We can declare a local variable, catch formal, pattern variable, etc., whose name is `_`, which has the effect of entering no new names in scope. It becomes an "initialize-only" variable.


    try { ... }
    catch (FooException _) { throw new BarException("foo"); }

- Partial inference. Here, underscore stands in for a type name. Today, we can infer type variables for generic method invocations and constructor invocations, but it is all-or-nothing. Being able to denote "infer this type" would allow us to do partial inference:


    foo.<String, _>m(...)

- "Any" patterns. Here, underscore is a pattern, which matches everything, and binds nothing.


    case Foo(var s, _): ...

We don't have to do all of these; right now we're not considering partial inference, but the other two are reasonable options. Unnamed variables have been a long-standing request; any patterns will likely be a common request soon as well.

For a match-all pattern, there is little to say other than "_" is one of the alternatives of the Pattern production, it is applicable to all types, it is unconditional on all types, and it has no bindings. The specification already has a concept of "any" patterns; this is just making it denotable.
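As a rough illustration of "matches everything, binds nothing", here is a minimal sketch of an any pattern nested in a record pattern. The `Pair` record and the `firstString` helper are hypothetical names invented for this example, and the code assumes a compiler that accepts `_` in this position (Java 22 or later does).

```java
class AnyPatternSketch {
    // Hypothetical record, used only for illustration.
    record Pair(Object first, Object second) {}

    // Returns the first component when it is a String.
    // The second component is matched by `_`: any value (including null)
    // is accepted, and no binding is introduced for it.
    static String firstString(Object o) {
        if (o instanceof Pair(String s, _)) {
            return s;
        }
        return "<no match>";
    }
}
```

Because `_` is unconditional, whether the pattern matches depends only on the first component; the second never affects the outcome.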

I think there is little controversy about using unnamed local variables (local variable declaration statements, catch formals, foreach induction variables, resources in try-with-resources) and unnamed lambda parameters. What is common to all of these is that these are _pure implementation details_, where the author has elected to not give a name to a variable that is entirely implementation-facing. This seems eminently reasonable. Unnamed parameters can help eliminate errors by capturing design assumptions and make life easier for static analysis tools that like to point out unused variables.
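A small sketch of the uncontroversial positions (foreach induction variable, catch formal, lambda parameter) might look like the following. The class and member names are hypothetical, and the code assumes a compiler that accepts `_` in these positions (Java 22 or later does).

```java
import java.util.function.BiFunction;

class UnnamedLocalsSketch {
    // foreach induction variable: we only care how many elements there are.
    static int count(Iterable<?> items) {
        int n = 0;
        for (Object _ : items) {
            n++;
        }
        return n;
    }

    // catch formal: the exception object itself is never inspected.
    static boolean isInteger(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException _) {
            return false;
        }
    }

    // lambda parameter: the first argument is deliberately ignored.
    static final BiFunction<String, Integer, Integer> SECOND = (_, n) -> n;
}
```

In each case the underscore records the design assumption ("this value is intentionally unused") in the code itself, which is the signal a static analysis tool would otherwise have to guess at.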

Where we stumble is on method parameters, because method parameter names serve two masters -- the implementation (as the declaration of a variable) and the API (as part of the specification of what the method does). Among other things, we like to document the semantics of method parameters in Javadoc with the `@param` tag, but doing so requires a name (or inventing a new Javadoc mechanism like `@param #4`, likely a loser). Secondarily, sometimes parameter names are retained in the MethodParameters attribute, though that attribute (JVMS 4.7.24) already supports parameters without names by using a zero CP index.
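To make the MethodParameters point concrete, here is a minimal reflection sketch showing that parameter names are only sometimes available at run time. The `transfer` method and its parameter names are hypothetical; whether the names are recorded depends on how the class was compiled (for example, `javac -parameters`), so the sketch reports both outcomes rather than assuming one.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

class ParameterNamesSketch {
    // Hypothetical method whose parameter names we inspect.
    static void transfer(String account, long amount) {}

    // Lists each parameter name and whether it came from the class file
    // or was synthesized by reflection (arg0, arg1, ...).
    static String describe() {
        try {
            Method m = ParameterNamesSketch.class
                    .getDeclaredMethod("transfer", String.class, long.class);
            StringBuilder sb = new StringBuilder();
            for (Parameter p : m.getParameters()) {
                sb.append(p.getName())
                  .append(p.isNamePresent() ? " (recorded)" : " (synthesized)")
                  .append('\n');
            }
            return sb.toString();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Code that consumes parameter names reflectively already has to cope with the "no name" case, which is why this concern is secondary to the Javadoc one.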

With `var`, we drew a clear line of "implementation only" -- you can't infer a method return type, even for a private method; you can only use it for local variables and lambda formals. This has been pretty successful.
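For reference, the `var` line can be sketched as follows (Java 11 or later, since `var` in lambda formals arrived in 11). The class and method names are hypothetical.

```java
import java.util.List;
import java.util.function.BinaryOperator;

class VarScopeSketch {
    static int sum(List<Integer> values) {
        var total = 0;                    // local variable declaration
        for (var v : values) {            // foreach induction variable
            total += v;
        }
        // lambda formals
        BinaryOperator<Integer> add = (var a, var b) -> a + b;
        return add.apply(total, 0);
    }

    // Not permitted, even for a private method: `var` as a return type,
    // method parameter type, or field type. The following does not compile:
    //     private static var broken(var x) { return x; }
}
```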

We've explored a number of intermediate points on the spectrum with varying degrees of stability:

A) Implementation only -- local variables, catch formals, for-loop induction variables, TWR resources, pattern variables, lambda formals

B) "A++", where we add in method parameters of anonymous classes

C) Adding in method parameters _for non-initial declarations_ -- allow unnamed parameters only for methods that override a method from a supertype, ensuring that there is a real specification of what the parameters mean.

D) Anything goes, any method parameter can be unnamed, throwing specification to the wind.
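To make the differences between these options easier to see, here is a sketch of what one has to write today, with the hypothetical forms each option would permit shown in comments. The `Listener` interface and all names are invented for illustration; the commented-out forms use proposed syntax and do not compile.

```java
class OptionsSketch {
    // Hypothetical supertype; its declaration is the "initial declaration"
    // that carries the specification of what each parameter means.
    interface Listener {
        String onEvent(String source, int code);
    }

    // Today: an overriding method must name the parameter it ignores.
    static final Listener NAMED = new Listener() {
        @Override
        public String onEvent(String source, int ignoredCode) {
            return source;
        }
    };

    // Under A, only the lambda form could drop the name:
    //     Listener l = (source, _) -> source;
    //
    // Under B (anonymous classes) or C (any overriding method), the
    // override above could be written as:
    //     @Override public String onEvent(String source, int _) { return source; }
    //
    // Under D only, an initial declaration could drop the name as well:
    //     static void legacy(String kept, int _) { }
}
```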

A is a stable point, and has the advantage of mostly lining up with where we can use `var`. But users will surely grumble that they can't use it for implementations of methods from supertypes. As this feature request predates lambdas and patterns, giving it to lambdas and patterns but not ordinary methods might feel a bit mean.

The motivation for B is obvious -- to support smooth refactoring between lambdas and inner classes -- but is not a very stable point, as one will immediately ask "what about refactoring to named classes".

C feels attractive, though there would surely be complaints too; it excludes constructors and static methods (which might sometimes want unnamed parameters when a parameter is no longer used, but stays around for binary compatibility), and even some initial declarations. But, these cases are likely to be somewhat more rare, so I don't object to leaving these aside. The main concern is that this might feel arbitrary. There is also the possibility for some confusion; it is not obvious what it means when you override a method that already has an unnamed parameter. Can you give it a name and use it? It is a little weird that the lack of name applies only to the implementation of the method, but somehow bleeds into the specification. There is also some impact on Javadoc, as well as lingering concerns that there are other shoes to drop other than Javadoc and MethodParameters.

D is also stable, but feels like it makes the language less safe, by making some methods unspecifiable. On the other hand, the people who might use it for initial declarations, static methods, etc, are also the sort of people who probably don't write specification anyway (otherwise they would realize that they are depriving their callers of useful information).

In (C), Javadoc could insert an `@implNote` that says something like "this implementation ignores the value of parameters <x> and <y> from declaring method Foo::bar". In (D), it could say "ignores its 3rd and 4th parameters", or insert synthetic `@param` tags for parameters whose name is something like "<unnamed>".

Past discussions seemed to gravitate toward either A or D, which are also the simplest / most stable points. I guess it becomes a question of getting over the "makes the language less safe" concerns.

Regardless, I'd like to see if we can quantify the "lingering concerns about other shoes to drop."
