As I mentioned yesterday, the high-order bit here is how we describe a
class whose (null-restricted) instances can tolerate (and possibly even
encourage) uninitialized use, just as the primitives do today. Ignoring
the surface syntax, what we really need is an evocative term for such a
class. This term has to be useful and evocative to multiple participants:
- The author of a class, who is making a decision about whether the
zero state represents a sensible default.
- The client of a class, who may exploit the fact that instances may
be safely used uninitialized, or who may want to reason about flattening.
- The specification / descriptive documents, which will need a way to
talk about "classes that are friendly to uninitialized use."
This concept is made more difficult because this property will only have
observable effects for variables with null-restricted types.
On 3/28/2023 3:13 PM, Brian Goetz wrote:
The recent update of JEP 401 contained a number of refinements to the
user model, specifically, separating the primitive/reference
distinction into a number of smaller distinctions (e.g., nullable vs
non-nullable, optional vs required construction.) Overall this has
been a very positive step forward.
We still have a need for the distinction between what we've been
calling B2 and B3; JEP 401 currently frames that in terms of
"construction is optional." This is a big step forward; indeed, the
key difference between them is whether the class _needs_ the
"variables start out as null, and all instances are created by
constructors" protection, or whether it admits the lighter-weight
initialization protocol of "there's a standard zero value, null-free
variables are initialized to that" that primitives enjoy today. (Note
that B3 classes don't require this lighter protocol, they merely
enable it, much as primitives all give you the option of boxing to get
the full conservative initialization protocol.)
The idea of framing this as "construction is optional" is a good one,
but the expression of it proposed in JEP 401 feels "not quite there".
In this note I'll propose an alternative presentation, but the main
goal here is around terminology and user model rather than syntax (so
please keep the syntax agitation to a reasonable level.)
The key distinction between B2 and B3 is that B3 has a _default value_
which the VM can summon at will. This enables non-nullable heap
variables to be flattened, because we can initialize these the same
way we initialize other fields and array elements. Further, that
default value is highly constrained; it is a physical zero, the result
of initializing all fields to their default value.
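To make "physical zero" concrete in terms of today's Java: fields and
array elements that are never assigned already start out at the zero
value for their type. What follows is only a sketch of existing
behavior, not of Valhalla semantics; the class name ZeroDefaults and its
fields are hypothetical names chosen for illustration.

```java
public class ZeroDefaults {
    // Fields deliberately left unassigned; each starts at its type's default.
    public int count;
    public double ratio;
    public boolean flag;
    public Object ref;

    // Array elements are initialized the same way as fields.
    public static boolean allZero(long[] cells) {
        for (long c : cells) {
            if (c != 0L) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        ZeroDefaults z = new ZeroDefaults();
        System.out.println(z.count); // 0
        System.out.println(z.ratio); // 0.0
        System.out.println(z.flag);  // false
        System.out.println(z.ref);   // null
        System.out.println(allZero(new long[3])); // true
    }
}
```

The default value of a B3 class would be the aggregate analogue of this:
every field at the value shown above for its type.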
Flattening is of course a goal, but it is not something that exists in
the programming model -- it's just an optimization. What exists in the
programming model is the default value, and what this unlocks is the
possibility for variables to be _implicitly initialized_.
Reference-typed variables today are _explicitly initialized_;
variables start out null and have to be initialized with a constructed
value. A class with a default value has the option (opted in through
null-exclusion) for its variables to be implicitly initialized, which,
like primitives, means that they start out with a valid default value,
and can be further assigned to.
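The two protocols can already be seen side by side in today's Java by
comparing a primitive array with an array of its box type. This is a
sketch under that analogy only; the class and method names
(InitProtocols, sumPrimitive, boxedFailsUninitialized) are hypothetical.

```java
public class InitProtocols {
    // Implicit initialization: every element starts at a valid default (0.0),
    // so the array is usable before anything has been assigned.
    public static double sumPrimitive(int n) {
        double[] xs = new double[n];
        double sum = 0;
        for (double x : xs) sum += x;
        return sum;
    }

    // Explicit initialization: every element starts as null, so using one
    // as a value fails until a constructed value has been assigned.
    public static boolean boxedFailsUninitialized(int n) {
        Double[] xs = new Double[n];
        try {
            double sum = 0;
            for (Double x : xs) sum += x; // unboxing null throws NullPointerException
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumPrimitive(4));            // 0.0
        System.out.println(boxedFailsUninitialized(4)); // true
    }
}
```

Under the framing above, a null-restricted variable of a class with a
default value would behave like the double[] case, and a nullable
variable of the same class would behave like the Double[] case.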
Framed this way, the Valhalla performance story simplifies to:
- Give up identity, get flattening on the stack;
- Further give up explicit initialization, get flattening for small
objects on the heap;
- Further give up atomicity, get flattening for larger objects on the
heap.
Giving up explicit initialization entails both the class opting out of
explicit initialization, _and_ the variable opting out of nullity.
The key new terminology that comes out of this is implicit vs explicit
initialization.
Syntactically, my preference is to indicate that the default value can
be summoned by giving a value class a _default constructor_:
    value class Complex {
        public final double re, im;
        public default Complex();
    }
A default constructor has no arguments, no body, no throws clause, and
implicitly initializes all fields to their default values. Unlike
identity classes, value classes don't get constructors implicitly; a
value class must declare at least one constructor, default or
otherwise. This replaces the idea of "optional constructor", which is
a negative statement about construction ("but you don't have to call
me"), with a more direct and positive statement that there is a
_default constructor_ with the required properties.
Note that this is similar to the existing concept of "default
constructor", which you get for free in an identity class if you don't
specify any constructors. It is possible we can unify these features
(and also with constructors in "agnostic" abstract classes), but first
let's work out what it would mean in value classes, and see if we like it.
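For comparison, the existing identity-class feature can be observed
directly with reflection. This sketch only shows what the language does
today, and it makes no claim about how the two features would be
unified; DefaultCtorDemo and Point are hypothetical names.

```java
public class DefaultCtorDemo {
    // No constructor is declared, so the compiler supplies a no-argument
    // default constructor for this identity class.
    static class Point {
        int x, y;
    }

    public static int declaredConstructorCount() {
        return Point.class.getDeclaredConstructors().length;
    }

    public static int defaultConstructorParameterCount() {
        return Point.class.getDeclaredConstructors()[0].getParameterCount();
    }

    // The fields have no initializers, so they keep their default values.
    public static boolean freshPointIsZero() {
        Point p = new Point();
        return p.x == 0 && p.y == 0;
    }

    public static void main(String[] args) {
        System.out.println(declaredConstructorCount());         // 1
        System.out.println(defaultConstructorParameterCount()); // 0
        System.out.println(freshPointIsZero());                 // true
    }
}
```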
In this model, a B3 class is just a value class with a default
constructor -> a default constructor means that you have the choice of
implicit or explicit initialization -> non-nullity at the use site
opts into implicit initialization -> B3! gets flattening (for small
layouts.)