For LW10, one of our goals is to support interactions between value types and
erased generics by having some form of a nullable value type.
The needs of the language factor heavily into the JVM design. We're not ready
to commit to language-level details, but it's likely that the language will
support nullable and non-nullable variations of the types declared by value
classes; and these variations will probably be supported in most places that
types can appear.
More generally, the language may support up to three different flavors of
nullability on some or all types:
- null-free: a type that does not include null (could be spelled Foo!)
- null-permitting: a type that allows but ignores nulls (could be spelled Foo~)
- null-checked: a type that allows and checks for nulls (could be spelled Foo?)
(Please note that this is placeholder syntax. There are lots of ways to map
this to real syntax. Unadorned names will map to one of these; it's possible
that migrating a class to be a value class will change the interpretation of
its unadorned name.)
Null-permitting and null-checked types are both "nullable"; the difference is
in how strongly the compiler enforces null checks. ("Null-permitting" is the
existing behavior for types like 'String'; "null-checked" is the style that
requires proof that nulls are absent before dereferencing.)
The other important concept from the language is conversions:
- A widening conversion (or something similar) supports treating a value of a
null-free type as null-permitting or null-checked
- A "null-free conversion" is required to go in the opposite direction, and
includes a runtime null check
- A "nullability conversion", like an unchecked conversion, might allow other
forms of conversions between types involving different nullabilities, including
in their type arguments or array component type.
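As a source-level sketch of the two directions, assuming a hypothetical value class Point (the class and variable names below are illustrative, not part of any proposed API), the widening direction needs no runtime action, while the null-free conversion amounts to a runtime null check:

```java
import java.util.Objects;

// Hypothetical value class standing in for a declared value class.
final class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

class Conversions {
    public static void main(String[] args) {
        Point p = new Point(1, 2);

        // Widening (null-free -> null-permitting/null-checked):
        // no runtime action needed.
        Point nullable = p;

        // Null-free conversion (the opposite direction): a runtime null
        // check, roughly what a compiler might emit for the narrowing.
        Point nullFree = Objects.requireNonNull(nullable);
        System.out.println(nullFree.x + "," + nullFree.y);  // prints "1,2"
    }
}
```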
Turning to the JVM with those language-level concepts in mind, I've put
together the following summary of four main designs we've considered. The goal
here is not to reach a conclusion about which path is best, but to make sure
we're accurately considering all of the implications in each case.
Nullable value types, null-free storage
---------------------------------------
In this approach, we use regular L types to represent value types, and these
types are nullable. Fields and arrays, via some sort of modifier, may choose to
be nullable or null-free.
JVM implications
- Need a mechanism (new opcode?) to indicate that an array allocation is
null-free
- The default value of a field/array depends on whether the "null-free"
modifier is used
- Fields and arrays that are marked null-free can, of course, be flattened
- Stack variables and method parameters/returns may always be null
- A putfield, putstatic, or aastore may fail with an NPE (or maybe ASE)
- JIT can optimistically assume no nulls and scalarize, but must check and
de-opt when a null is encountered
- The "null-free" modifier is only allowed with value class types, and must be
validated early (e.g., to decide on field layout)
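The proposed null-free array store behavior can be simulated in today's Java; this sketch (the class and method names are mine, not part of the design) rejects nulls at store time with an NPE, the way a null-free aastore would. A real null-free array would also initialize its elements to the value class's default value rather than null, which this sketch does not model.

```java
import java.util.Objects;

// Sketch of a null-free array: stores are checked for null, mirroring
// the NPE that the proposed null-free aastore would throw.
final class NullFreeArray<T> {
    private final Object[] elements;

    NullFreeArray(int length) { elements = new Object[length]; }

    void set(int i, T value) {
        // The null check a null-free aastore would perform.
        elements[i] = Objects.requireNonNull(value, "null-free array");
    }

    @SuppressWarnings("unchecked")
    T get(int i) { return (T) elements[i]; }
}
```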
Compilation strategy
Val? maps to LVal;
Val~ maps to LVal;
Val! maps to LVal;
The nullability of the type in a field declaration or array creation expression
determines whether the "null-free" modifier is used or not.
Nullability conversions are no-ops; null-free conversions are either compiled
to explicit null checks or are implicit in an invoke*/getfield/putfield.
Language implications
- Null-free value types typically get flattened storage and scalarized
invocations
- Array store runtime checks may include a null check
- Methods may not be overloaded on different nullabilities of the same type
- Null-free parameters/returns may be polluted with nulls due to inconsistent
compilation or non-Java interop—detected with an NPE on storage or dereference
- A conversion from Val~[] to Val![] could be supported, but the result would
not perform the expected runtime checks
Migration implications
- Refactoring a class to be a value class is a binary compatible change (except
where this involves incompatible changes like removing a public constructor);
before recompilation (which may reinterpret some unadorned names), treatment of
nulls does not change
- Changing the nullability of a type is a binary compatible change; library
clients who expect nullable storage may see surprising NPEs or ASEs
Always null-free value types
----------------------------
In this approach, we use regular L types to represent value types, and these
types are null-free. Non-value L types continue to be nullable. A use-site
attribute tracks which class names represent value classes; validation lazily
ensures consistency with the declaration.
JVM implications
- Fields, arrays, and method parameters and returns with value class types can
be flattened/scalarized
- The 'null' verification type is not a subtype of any value class type
- Casts to value class types must fail on 'null' (CCE or NPE)
- At method preparation, field/method resolution, and class loading, a check
similar to class loader constraints ensures that classes agree on value classes
in the descriptor
- Various other vectors for getting data into the JVM should prevent nulls, or
have contracts that allow crashing, etc., if data is corrupted
- Classes in the value classes attribute are allowed to be loaded early (e.g.,
to decide on field layout)
- If the value classes attribute does not mention a value class, it's possible
for variables/fields of that type to be null, but an error will occur when an
attempt is made to load the class or resolve against a class that disagrees
Compilation strategy
Val? maps to Ljava/lang/Object;
Val~ maps to Ljava/lang/Object;
Val! maps to LVal;
Every referenced value class is listed in the value classes attribute.
Nullability conversions are no-ops; null-free conversions are compiled to
checkcasts (even for member access). Casts that target Val?/Val~ compile to a
checkcast guarded by a null check, where null always succeeds.
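As a source-level sketch of these two compiled forms (Val and the helper methods below are hypothetical; today's checkcast accepts null, so the null-free case is simulated with an explicit null check):

```java
import java.util.Objects;

// Stand-in for a value class whose L type is null-free under this design.
final class Val {
    final int v;
    Val(int v) { this.v = v; }
}

class Design2Casts {
    // Null-free conversion to Val!: under this design a plain checkcast
    // would reject null; here an explicit null check simulates that.
    static Val toValBang(Object o) {
        return (Val) Objects.requireNonNull(o);
    }

    // Cast targeting Val? / Val~: a checkcast guarded by a null check,
    // where null always succeeds.
    static Object toValNullable(Object o) {
        return (o == null) ? null : (Val) o;
    }
}
```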
Language implications
- Null-free value types typically get flattened storage and scalarized
invocations
- Array store runtime checks may include a null check
- Val~[] and Val?[] do not perform array store checks at all—any Object may end
up polluting these arrays (creating arrays of these types might be treated as
an error, like T[])
- Val~ and Val? are overloading-hostile: their use in signatures conflicts with
Object and all other null-permitting/null-checked value types
- Null-permitting/null-checked value type parameters and returns may be
polluted with other types due to inconsistent compilation or non-Java
interop—detected with a CCE on null-free conversion
- A conversion from Val~[] to Val![] cannot be allowed
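The Val~[] pollution hazard parallels today's erased generics; this analogy (using List rather than arrays, with invented variable names) shows how erasure to Object skips store checks and defers detection to a later use site:

```java
import java.util.ArrayList;
import java.util.List;

class ErasurePollution {
    public static void main(String[] args) {
        List<Object> cells = new ArrayList<>();

        // Erased view, analogous to Val~ erasing to Object in this design.
        @SuppressWarnings("unchecked")
        List<String> polluted = (List<String>) (List<?>) cells;

        cells.add(42);  // no store check; the "String" list is now polluted

        // An Object-typed read succeeds; the failure surfaces only when a
        // use forces a cast (here, a CCE if read through a String variable).
        Object first = polluted.get(0);
        System.out.println(first instanceof String);  // prints "false"
    }
}
```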
Migration implications
- Refactoring a class to be a value class is a binary incompatible change due
to inconsistent value class attributes
- Changing from a null-permitting/null-checked to null-free type (or vice
versa) is a binary incompatible change unless there's some form of support for
type migrations
Null-free types with new descriptors
------------------------------------
In this approach, we use regular L types to represent nullable value types, and
introduce other types (spelled, say, with a "K") to represent null-free value
types. K types are subtypes of L types, and casts can be used to convert from L
to K.
JVM implications
- Descriptor syntax needs to support 'K'
- To support K casts, we need ClassRefs that indicate K-ness, a new opcode, or
some other mechanism
- Fields, arrays, and method parameters and returns with K types can be
flattened/scalarized
- The 'null' verification type is not a subtype of K types
- Casts to K types must fail on 'null'
- Various other vectors for getting data into the JVM should prevent nulls, or
have contracts that allow crashing, etc., if data is corrupted
- Classes named by K types are allowed to be loaded early (e.g., to decide on
field layout)
Compilation strategy
Val? maps to LVal;
Val~ maps to LVal;
Val! maps to KVal;
Nullability conversions are no-ops; null-free conversions are either compiled
to explicit casts or are implicit in an invoke*/getfield/putfield.
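A small sketch of how a tool might read the extended descriptor syntax; the 'K' carrier is the proposal above, while the helper method and category strings are invented for illustration:

```java
// Hypothetical reading of the extended descriptor syntax, where 'K'
// marks a null-free value class type alongside the existing 'L' types.
class DescriptorKinds {
    static String kind(String fieldDescriptor) {
        return switch (fieldDescriptor.charAt(0)) {
            case 'K' -> "null-free value type";      // e.g. "KVal;"  (Val!)
            case 'L' -> "nullable reference type";   // e.g. "LVal;"  (Val? / Val~)
            case '[' -> "array type";
            default  -> "primitive type";            // e.g. "I"
        };
    }
}
```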
Language implications
- Null-free value types typically get flattened storage and scalarized
invocations
- Array store runtime checks may include a null check
- Methods may be overloaded with a null-free type vs. a
null-permitting/null-checked type (but null-permitting vs. null-checked is not
allowed)
- Pollution of null-free variables or arrays is impossible
- A conversion from Val~[] to Val![] cannot be allowed
Migration implications
- Refactoring a class to be a value class is a binary compatible change (except
where this involves incompatible changes like removing a public constructor);
before recompilation (which may reinterpret some unadorned names), treatment of
nulls does not change
- Changing from a null-permitting/null-checked to null-free type (or vice
versa) is a binary incompatible change unless there's some form of support for
type migrations
Nullability notations on types
------------------------------
In this approach, we use regular L types to represent value types, and these
types are nullable by default. To indicate that a particular field, array, or
parameter/return is null-free, some form of side notation is used.
(Deliberately using the word "notation" rather than "annotation" or "modifier"
here to avoid committing to an encoding.)
This is similar to "nullable value types, null-free storage", except that the
null-free notation can be used on method parameters/returns.
This is similar to "always null-free value types", except that instead of
tracking value classes in each class file, we track null-free value types per
use site.
This is similar to "null-free types with new descriptors", except that the
notations are not part of descriptors and don't require any explicit
conversions—they are not part of the verification type system.
JVM implications
- Need a mechanism to encode notations, both for descriptors and for array
creations
- The default value of a field/array depends on whether the "null-free"
notation is used
- Fields, arrays, and method parameters and returns that are marked null-free
can be flattened/scalarized
- Stack variables may generally be null, unless a static analysis proves
otherwise
- A putfield, putstatic, aastore, or method invocation may fail with an NPE (or
maybe ASE)
- Method overriding allows nullability mismatches; calls must be able to
dynamically adapt (e.g., through multiple v-table entries and VM-generated
bridges)
- Types marked null-free are allowed to be loaded early (e.g., to decide on
field layout)
Compilation strategy
Where '*' represents a side notation that a type is null-free:
Val? maps to LVal;
Val~ maps to LVal;
Val! maps to LVal;*
Nullability conversions are no-ops; null-free conversions are either compiled
to explicit null checks or are implicit in an invoke*/getfield/putfield.
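One conceivable encoding of the side notation is a declaration annotation; everything below (the annotation name, the Point class, and the Holder class) is invented for illustration and deliberately does not commit to the design's actual encoding:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Invented encoding: a notation on the use site, not a new descriptor.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER})
@interface NullFree {}

final class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

class Holder {
    @NullFree Point location = new Point(0, 0);  // descriptor stays LPoint;

    // The parameter's descriptor is unchanged, so a nullability mismatch
    // in an override does not change the method's binary signature.
    void move(@NullFree Point p) { location = p; }
}
```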
Language implications
- Null-free value types typically get flattened storage and scalarized
invocations
- Array store runtime checks may include a null check
- Methods may not be overloaded on different nullabilities of the same type
- Pollution of null-free variables, arrays, or parameters/returns is impossible
- A conversion from Val~[] to Val![] could be supported, but the result would
not perform the expected runtime checks
Migration implications
- Refactoring a class to be a value class is a binary compatible change (except
where this involves incompatible changes like removing a public constructor);
before recompilation, treatment of nulls does not change
- Changing the nullability of a type is a binary compatible change; library
clients who expect a nullable API may see surprising NPEs or ASEs