Applicability check
> When loading a parameterization of a generic class, we perform an applicability check for each member as we
> encounter it; in the model outlined here, this is a straight subtyping check of the current parameterization against
> the restriction domain.
In order to support the subtyping check, the applicability check should happen in the specializer, and not when loading the specialization. Both the type information and the class hierarchy are more easily accessible at that point.
> If there are duplicate applicable members in a classfile where neither's restriction domain is more specific than the
> other's, then the VM is permitted to make an arbitrary choice.
Seconding Karen's comment, we'd like to avoid "arbitrary" choices, as both users and JVM implementers need to know how to get consistent behaviour. Unspecified behaviour may change, and it may also have corner cases that are treated differently by different JVM implementations.
Would it be better to reject a specialization where there are multiple maximally specific applicable members, or to reject templates that would allow such scenarios?
Reflection
We need to specify what the reflection behaviour will be for conditional members, as it may depend on how each JVM implementation decides to represent species internally. The current reflection behaviour is not well specified, and adding conditional members may add more inconsistencies.
JVMTI / class redefinition / class retransformation
This applies to conditional members specifically, and also to specializations in general.
What happens when a generic class is redefined? Will the whole specialization nest require redefinition, or will the redefinition be limited to the redefined specialization? What about changes to a generic class (template)? What happens if the restriction domain of a conditional member changes?
Will only non-conditional methods be in the any-interface? Or will conditional methods have a default implementation (e.g. throw UnsupportedOperationException)?
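To make the second option concrete, here is a rough sketch in today's Java. All names (AnyList, IntList, RefList) are hypothetical stand-ins that I made up for illustration: AnyList plays the role of the any-interface, IntList the species where the conditional member applies, and RefList an erased species where it does not. Nothing here is meant to suggest how a VM would actually lay this out.

```java
// Hypothetical stand-in for the any-interface of a generic class MyList<any T>.
interface AnyList {
    int size();

    // Conditional member, conceptually restricted to MyList<int>.
    // Species where it is not applicable inherit this throwing default.
    default int sum() {
        throw new UnsupportedOperationException("sum() is not a member of this parameterization");
    }
}

// Stand-in for the species MyList<int>: the conditional member applies and is implemented.
final class IntList implements AnyList {
    private final int[] elements;

    IntList(int... elements) { this.elements = elements.clone(); }

    @Override public int size() { return elements.length; }

    @Override public int sum() {
        int total = 0;
        for (int e : elements) total += e;
        return total;
    }
}

// Stand-in for an erased species such as MyList<String>: sum() is not overridden.
final class RefList implements AnyList {
    private final Object[] elements;

    RefList(Object... elements) { this.elements = elements.clone(); }

    @Override public int size() { return elements.length; }
}
```

With this shape, a call through the any-interface always links, but fails at run time for species outside the restriction domain. The first option (only non-conditional methods in the any-interface) would instead make such a call impossible to express.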
Motivation
I think the API migration concern is compelling. But to handle that, it's sufficient to be able to restrict members to the all-erased specialization (or else require them to be total). This mechanism could be very simple, and the resulting API differences seem to be well justified by the compatibility requirements.
More general conditional inclusion starts to get complex pretty quickly. We're already talking about things like lattices of tuples of specialized type arguments, or logical operators on the restrictions ("where not", and implicit conjunctions). This isn't just implementation complexity for the VM. Whatever is introduced here is something users will need to reason about if it affects observable behaviour, or if it affects the interface that a type exposes.
In general I like the idea of a facility that allows for method implementations to be specialized for known types. It can help to get performance in cases where otherwise some abstraction would get in the way by forcing us to treat things uniformly. And the spirit of such specialization is that it should be (at least mostly) transparent, so users shouldn't usually need to think about how the implementation is selected in this case.
However, at the Java language level, conditional members have a significant limitation here. Erasure means that it's only possible to specialize for primitive types. There's no way to specialize for String, for example.
Then there is type-specific functionality such as List<int>.sum(). This doesn't strike me as something that belongs in List, any more than these do:
- List<String>.append()
- List<List<T>>.append()
- List<UnaryOperator<T>>.compose()
But due to erasure, these wouldn't be expressible. This kind of API extension is limited to primitive types. (Later it could be done for value types more generally, but I don't think it would be good to allow users to special-case their own APIs for user-defined value types, but not for T=String.)
I think the usual benefit gained by adding a method to an interface (when it could be implemented outside) is that implementors can specialize it for each type of List. But I'm not sure I see that here.
We would get the fluent style of call "(...).sum()", but I don't think adding methods to List is the right way to get that, especially if it will only work for primitive types, and if it means that users need to think about "sometimes" methods of List more often than necessary.
----- Original message -----
From: Brian Goetz <[email protected]>
Sent by: "valhalla-spec-experts" <[email protected]>
To: [email protected]
Cc:
Subject: Conditional members
Date: Tue, Mar 29, 2016 3:53 PM
Yet another in a series of disconnected, bottom-up (starting at the VM) memos laying the groundwork for the enhanced generics model.
Basic Problem
=============
It may be desirable, for purposes of expressiveness or migration compatibility, to declare class members that are only members of a specific subset of parameterizations of a generic class. Examples include:
- Reference-specific API assumptions. In our analysis of the Collection classes, we identified various methods that fail to make the jump to any-generics for various reasons. These include methods like Collection.toArray(), whose signature makes no sense for primitive parameterizations, or Map.get(), which uses `null` (not in the domain of primitives) to indicate "not present." We can't take these methods away from reference instantiations, but we don't want to propagate them into primitive instantiations.
- Better implementations enabled by known type parameters. Generic classes will provide generic implementations, but sometimes better implementations are possible when concrete types are known. In this case, an implementation would provide a generic implementation and zero or more implementations that are restricted to more specific parameterizations.
- Functionality available only on specific implementations. For example, List<int> could have a sum() method even though sum() does not make sense on all instantiations. (This is the declaration-site version of what C# enables at the use site with extension methods -- allowing methods to be injected into types, rather than classes.)
We've not yet spent a lot of time identifying the proper way to surface this in the language. For methods, one possibility is to use receiver parameters (added in Java SE 8) to qualify the receiver type:
int sum(List<int> this) { ... }
This gets the point across clearly enough (and is analogous to how C# does extension methods), but has several drawbacks: it doesn't scale to fields, nor does it scale well to a conditional-membership model that is anything other than "I am a member of parameterization X". (Where this might fall down, for example, would be when we want members declared as "I am *not* a member of parameterization X".)
Note that in the second motivating example, there will be two members with the same name and signature; we want one to take precedence over the other.
We call these "conditional" or "restricted" members.
Classfile Strawman
==================
Here's a strawman of how we might represent this at the VM level.
We define a new attribute, `Where`, which can be applied to instance fields, instance methods, and constructors:
Where {
u2 name_index;
u4 length;
u2 restrictionDomain; // refers to a ParamType constant
}
The restriction domain indicates the parameterization to which this member is restricted; in the absence of a Where attribute, it is assumed to be ThisClass<any, any, ...>.
When loading a parameterization of a generic class, we perform an applicability check for each member as we encounter it; in the model outlined here, this is a straight subtyping check of the current parameterization against the restriction domain.
It is possible there could be duplicate applicable methods; this arises when we have a specialization-specific "override", as in:
class Foo<any T> {
// total method m(T)
void m(T t) { }
// Specialization of m(T) for T=int
void m(Foo<int> this, int i) { ... }
}
When we find a duplicate applicable member, we perform a "more specific" check comparing the restriction domains; in this case, the second method has a restriction domain of Foo<int>, which is more specific than the (implicit) Foo<any> restriction domain of the generic method, so we prefer the second member.
This procedure is strictly linear; as each member is read from the classfile, we can make a quick determination as to whether to keep or discard it; if we keep it, we might replace it later with a more specific one as we find it. Modulo cases where there are multiple applicable overloads that are equally specific, it is also deterministic; whether we find the generic version of m() or its specialization first, we'll end up with the same set of members.
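The linear procedure described above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: a parameterization is modelled as a list of type-argument names with "any" as the unrestricted entry, subtyping is a position-wise comparison, and a member's key is a plain string standing for its name and signature after substitution. The Member and MemberSelector classes are hypothetical and do not correspond to any VM data structure.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a member together with its Where restriction domain.
final class Member {
    final String key;           // name and signature, simplified to a string
    final List<String> domain;  // one entry per type variable; "any" means unrestricted
    final String label;         // only used to tell members apart in this sketch

    Member(String key, List<String> domain, String label) {
        this.key = key;
        this.domain = domain;
        this.label = label;
    }
}

final class MemberSelector {
    // sub is a subtype of sup if every position of sup is "any" or identical to sub's.
    static boolean isSubtype(List<String> sub, List<String> sup) {
        if (sub.size() != sup.size()) return false;
        for (int i = 0; i < sub.size(); i++) {
            String s = sup.get(i);
            if (!s.equals("any") && !s.equals(sub.get(i))) return false;
        }
        return true;
    }

    static boolean isMoreSpecific(List<String> a, List<String> b) {
        return isSubtype(a, b) && !a.equals(b);
    }

    // Single pass over the members in classfile order: keep, discard, or replace.
    static Map<String, Member> select(List<String> current, List<Member> classfileOrder) {
        Map<String, Member> kept = new LinkedHashMap<>();
        for (Member m : classfileOrder) {
            if (!isSubtype(current, m.domain)) continue;   // applicability check failed
            Member previous = kept.get(m.key);
            if (previous == null || isMoreSpecific(m.domain, previous.domain)) {
                kept.put(m.key, m);                        // first hit, or a more specific one
            }
            // Otherwise the earlier member stays. If the two domains are
            // incomparable, the result depends on classfile order.
        }
        return kept;
    }

    public static void main(String[] args) {
        Member generic = new Member("m(T)", List.of("any"), "generic m(T)");
        Member forInt = new Member("m(T)", List.of("int"), "m specialized for T=int");
        // Foo<int>: both apply, and the Foo<int> member wins in either order.
        System.out.println(select(List.of("int"), List.of(generic, forInt)).get("m(T)").label);
        System.out.println(select(List.of("int"), List.of(forInt, generic)).get("m(T)").label);
        // Foo<long>: only the generic member is applicable.
        System.out.println(select(List.of("long"), List.of(generic, forInt)).get("m(T)").label);
    }
}
```

The comment in the loop marks the one place where the outcome depends on classfile order, which is exactly the equally-specific case.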
If there are duplicate applicable members in a classfile where neither's restriction domain is more specific than the other's, then the VM is permitted to make an arbitrary choice (as they are both applicable and equally specific). The static compiler can work to filter out such situations, if desired, for example by imposing a "meet rule"; if we had:
void foo(Foo<int,any> this)
void foo(Foo<any,int> this)
a meet rule would require the additional overload
void foo(Foo<int,int> this)
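A static compiler could check the meet rule along the following lines. This is a sketch under the assumption that a restriction domain can be modelled as a list of type-argument names with "any" as the unrestricted entry, so that the meet of two domains is computed position by position. MeetRule and its methods are hypothetical names, not part of any proposal.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class MeetRule {
    // Position-wise meet of two restriction domains.
    // Empty if the domains are disjoint, i.e. they can never both be applicable.
    static Optional<List<String>> meet(List<String> a, List<String> b) {
        if (a.size() != b.size()) return Optional.empty();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < a.size(); i++) {
            String x = a.get(i);
            String y = b.get(i);
            if (x.equals("any")) {
                result.add(y);
            } else if (y.equals("any") || y.equals(x)) {
                result.add(x);
            } else {
                return Optional.empty();
            }
        }
        return Optional.of(result);
    }

    // Domains for which an additional overload would be required, given the
    // restriction domains of all overloads that share one name and signature.
    static Set<List<String>> missingMeets(List<List<String>> domains) {
        Set<List<String>> missing = new LinkedHashSet<>();
        for (int i = 0; i < domains.size(); i++) {
            for (int j = i + 1; j < domains.size(); j++) {
                Optional<List<String>> m = meet(domains.get(i), domains.get(j));
                // If one domain is more specific than the other, the meet is that
                // domain itself, so it is already present and nothing is reported.
                if (m.isPresent() && !domains.contains(m.get())) {
                    missing.add(m.get());
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<List<String>> foo = List.of(List.of("int", "any"), List.of("any", "int"));
        System.out.println(missingMeets(foo));
    }
}
```

For the two foo overloads above, the only missing domain is the one corresponding to Foo<int,int>. Note that this does a single pairwise pass; with three or more overlapping domains the check would have to be repeated after adding the reported meets, until nothing is missing.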
