----- Original Message -----
From: "Brian Goetz" <[email protected]>
To: "Remi Forax" <[email protected]>
Cc: "valhalla-spec-experts" <[email protected]>
Sent: Wednesday, July 20, 2022 7:34:04 PM
Subject: Re: The storage hint model
Yes, I know, we have already discussed several models like that. But I think it's
a good idea to re-examine those because I believe they are more attractive
today.
Indeed, this has come up several times. It is attractive to think of flattening
entirely as a ’storage class’, and fair to reexamine it (this also came up in
an internal discussion recently) but I think in the end this still will be a
choice that we regret.
The main issue with the .val model is that it presents two *types* to the user
while what we really want is mostly to flatten the storage and have a precise
method calling convention.
Those two goals are not equal; the first is far more important than the second,
to the point where the coding guideline proposed by Brian is to use .ref for
the parameters and .val for the fields and arrays.
FTR, the motivation for the guideline here is “use .val where it makes the
most difference.” There’s nothing *wrong* with using val types on the stack,
you just don’t get the enormous payback you do with heap variables. But I can
imagine — especially in a specialized-generics world — that there is value to
using .val in APIs as well, because it carries the semantic “not null”
information as well as the flattening hint.
T.flat carries the same semantics, the difference is that you have to
explicitly use T.flat where you want the flattening in the generic code.
class Container<T> {
    private T.flat value;             // here
    public void set(T.flat value) {   // but also here
        this.value = value;
    }
    public T.flat get() {             // and here too
        return value;
    }
}
So yes, it makes the generic code more cumbersome to write, but it also makes
generic classes easier to use, because the writer of the generic class decides
what can be flattened (or not), not the user of the generic class.
We still need .val and .ref to be able to specialize generics, right? No, I
don't think so. We technically do not have to pass a .val as type argument to
be able to specialize a generic class; we just need to pass a type argument
that can be flattened if possible.
Here’s where I disagree. If field declaration and array creation expressions
were the only places you needed to say .val, I’d be much more sympathetic to
the container-properties model. But in a world with specialized generics, we
want to flow the types throughout, not only to field layout, but flowing the
non-null constraint to the JIT, etc. The `T.flat` approach will feel like a
hack, because it is, and as an unbonus, people will forget almost all the time
because having to select a storage class for an abstractly typed variable will
feel unnatural.
People will forget T.flat as much as they will forget C.flat (C.val if you
prefer), that's true, but that's the price to pay to be safe by default, in both
cases.
If you want to "fix" the potential missing T.flat, it's the same fix as with a
potential missing C.flat, have a way to declare a value class flat by default at
declaration site. But that's a separate discussion.
When I say ArrayList<Foo.val>, I want the properties of
Foo.val to flow to *all* the places where a T is being moved around.
Maybe you want or maybe you don't, here is an interesting implementation of
ArrayList
public class ArrayList<E> {
    private E[] array;
    private int size;

    public ArrayList() {
        array = new E.flat[16]; // ahah, flat by default !
    }

    public boolean add(E element) { // E is not flat
        if (element == null && !array.getClass().isNullable()) {
            // need to store null, use a nullable array
            var newArray = new E[array.length];
            System.arraycopy(array, 0, newArray, 0, array.length);
            array = newArray;
        }
        if (array.length == size) {
            array = Arrays.copyOf(array, size * 2);
        }
        array[size++] = element;
        return true;
    }
}
It starts with a flat array, and if a null element is added, it "unflattens" itself.
This implementation is interesting because once recompiled with the new generics, a
new ArrayList<Integer>() will use a flattened array by default.
I have no idea about the performance of this kind of implementation, but using
T.flat gives better control over what is flattenable or not in the implementation.
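To make the "start flat, unflatten on the first null" idea concrete, here is a rough sketch that runs on today's Java. It is only an analogy, not the proposed feature: E.flat does not exist, so the sketch fixes E to Integer and uses an int[] as a stand-in for the flat array and an Integer[] as the nullable one. The class name AdaptiveIntegerList and its methods are made up for the illustration.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical stand-in for the ArrayList<E> above, with E fixed to Integer.
// int[] plays the role of the flat (non-nullable) array, Integer[] the nullable one.
final class AdaptiveIntegerList {
    private int[] flat = new int[16];   // "flat by default"
    private Integer[] boxed;            // allocated only once a null has been added
    private int size;

    boolean isFlat() {
        return boxed == null;
    }

    int size() {
        return size;
    }

    boolean add(Integer element) {
        if (element == null && isFlat()) {
            // Need to store null: migrate to a nullable array ("unflatten").
            Integer[] newArray = new Integer[flat.length];
            for (int i = 0; i < size; i++) {
                newArray[i] = flat[i];
            }
            boxed = newArray;
            flat = null;
        }
        if (isFlat()) {
            if (flat.length == size) {
                flat = Arrays.copyOf(flat, size * 2);
            }
            flat[size++] = element;     // unboxing is safe: element is non-null here
        } else {
            if (boxed.length == size) {
                boxed = Arrays.copyOf(boxed, size * 2);
            }
            boxed[size++] = element;
        }
        return true;
    }

    Integer get(int index) {
        Objects.checkIndex(index, size);
        // Integer.valueOf keeps both branches typed as Integer, so a stored null is not unboxed.
        return isFlat() ? Integer.valueOf(flat[index]) : boxed[index];
    }

    public static void main(String[] args) {
        AdaptiveIntegerList list = new AdaptiveIntegerList();
        list.add(1);
        list.add(2);
        System.out.println(list.isFlat()); // prints "true"
        list.add(null);
        System.out.println(list.isFlat()); // prints "false"
    }
}
```

The migration is one-way and costs a full copy, which is part of why the performance of such an implementation is an open question.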
(This scheme rests on a clever but implicit assumption: that `T.flat` really
means “as flat as T can be”, which for a ref, is “not at all.” It's clever, but
for this reason `T.flat` is kind of a misnomer.)
If it's a value class, T.flat can still flatten the value if the size is <= 128
bits but yes, T.flat means as flat as T can be.
we can write instead
value class C {
    // ...
}
class Container<T> {
    private T.flat value;
Yeah, this is where you lose me. When you’re writing a generic class like
ArrayList<T>, you’re abstracted from the details of heap layout, and it seems
overwhelmingly likely you’d forget to say T.flat somewhere. It also feels very
“nonparametric”, because we’ve created a second, ad-hoc channel through which
information flows, and that channel is “bumpier”. But it's worse than that,
because there’s less type information in the program, and therefore the VM has
to make more conservative assumptions about nullity.
This has been true of the previously proposed storage hint models, but unlike
those, this model allows parameters to be declared as T.flat.
I think that is the missing piece: by propagating T.flat, the VM has enough
information and does not need to make conservative assumptions.
I get what you are trying to accomplish; the ref/val distinction feels like it
is almost something we can get rid of. But I think swapping it for a storage
class model is worse, because it is asking users to think about low-level
details in more places, rather than using types and having the information flow
with the types.
In more places inside the generic code, in fewer places inside the user code.
It's a trade I'm happy to make.
And as you point out, it means there are more possible ways
nulls can get deeper into the system before NPEing.
Yes, it can be as late as reaching a putField, but that is because as a class
writer you have more control.
For example with List.of(), which never allows null, delaying the NPE may provide
better error messages: a requireNonNull may be better than having an NPE at the
call site, as List.<C.val>of(null) will do.
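As a small illustration of the error-message point, here is a sketch in today's Java, where the factory itself takes nullable references and does the check. The class LateNullCheckDemo and the helpers listOf and messageFor are hypothetical names for the example. It says nothing about how List.<C.val>of(null) would actually report the failure.

```java
import java.util.List;
import java.util.Objects;

final class LateNullCheckDemo {
    // Hypothetical factory: accepts nullable references and rejects nulls itself,
    // so the class writer chooses the wording of the failure.
    @SafeVarargs
    static <T> List<T> listOf(T... elements) {
        Objects.requireNonNull(elements, "elements array must not be null");
        for (int i = 0; i < elements.length; i++) {
            Objects.requireNonNull(elements[i], "element at index " + i + " must not be null");
        }
        return List.of(elements);
    }

    // Runs the action and returns the NPE message, or "no exception" if none was thrown.
    static String messageFor(Runnable action) {
        try {
            action.run();
            return "no exception";
        } catch (NullPointerException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        // prints "element at index 1 must not be null"
        System.out.println(messageFor(() -> listOf("a", null, "c")));
        // List.of also throws an NPE here, but the message is whatever the JDK chooses.
        System.out.println(messageFor(() -> List.of("a", (String) null)));
    }
}
```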
Rémi