----- Original Message -----
> From: "daniel smith" <[email protected]>
> To: "valhalla-spec-experts" <[email protected]>
> Sent: Tuesday, October 5, 2021 01:34:37
> Subject: Addressing the full range of use cases
> When we talk about use cases for Valhalla, we've often considered a very broad
> set of class abstractions that represent immutable, identity-free data. JEP 401
> mentions varieties of integers and floats, points, dates and times, tuples,
> records, subarrays, cursors, etc. However, as shorthand this broad set often
> gets reduced to an example like Point or Int128, and these latter examples are
> not necessarily representative of all candidate value types.
Yes!
>
> Specifically, our favorite example classes have a property that doesn't
> generalize: they'll happily accept any combination of field values as a valid
> instance. (In fact, they're even happy to accept any combination of *bits* of
> the appropriate length.) Many candidate primitive classes don't have this
> property—the constructors do important validation work, and only certain
> combinations of fields are allowed to represent valid instances.
I now believe the mantra "codes like a class, works like an int" is harmful:
a class provides encapsulation, an int has no encapsulation, so there is a
mismatch.
>
> Related areas of concern that we've had on the radar for a while:
>
> - The "all zeros is your default value" strategy forces an all-zero instance
> into the class's value set, even if that doesn't make sense for the class. Many
> candidate classes have no reasonable default at all, leading naturally to a wish
> for "null is your default value" (or other, more exotic, strategies involving
> revisiting the idea that every type has a default value). We've provided
> 'P.ref' for those use sites that *need* null, but haven't provided a complete
> story for value types that want it to be *their* default value, too.
>
> - Non-atomic heap updates can be used to create new instances that arbitrarily
> combine previously-validated instances' fields. There is no guarantee that the
> new combination of fields is semantically valid. Again, while there's precedent
> for this with 'double' and 'long' (JLS 17.7), those are special cases that
> don't generalize—any combination of double bit fields is *still a valid
> double*. (This is usually described as "tearing", although JLS 17.6 has
> something else in mind when it uses that word...) The language provides
> 'volatile' as a use-site opt-in to atomicity, and we've toyed with a
> declaration-site opt-in as well. But object integrity being "off" by default
> may not be ideal.
>
> - Existing class types like LocalDate are both nullable and atomic. These are
> useful properties to preserve during migration; nullability, in particular, is
> essential for source compatibility. We've provided reference-default
> declarations as a mechanism to make reference types (which have these
> properties) the default, with 'P.val' as an opt-in to value types. But in doing
> so we take away the many benefits of value types by default, and force new code
> to work with the "bad name".
The existing class LocalDate is not atomic per se: "atomic" in Java implies
volatile, and currently, if a LocalDate field is updated in one thread, another
thread may never see that update.
LocalDate is currently not tearable; a QLocalDate; would be tearable in case of
racy code.
And yes, nullability is a huge compatibility issue.
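As a reminder of the visibility point, a minimal sketch in plain Java (no
Valhalla features involved):

import java.time.LocalDate;

class Visibility {
  LocalDate date = LocalDate.of(2021, 10, 5);  // plain (non-volatile) field

  void writer() {
    date = LocalDate.now();  // updated from one thread
  }

  void reader() {
    // without volatile there is no happens-before edge, so another thread
    // may legitimately keep seeing the old LocalDate forever
    LocalDate seen = date;
  }
}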
>
> While we can provide enough knobs to accommodate all of these special cases,
> we're left with a complex user model which asks class authors to make n
> different choices they may not immediately grasp the consequences of, and class
> users to keep 2^n different categories straight in their heads.
Yes!
>
> As an alternative, we've been exploring whether a simpler model is workable. It
> is becoming clear that there are (at least) two clusters of uses for value
> types. The "classic" value types are like numerics -- they'll happily accept
> any combination of field values as a valid instance, and the zero value is a
> sensible (often the best possible) default value. They make relatively little
> use of encapsulation. These are the ones that best "work like an int." The
> "encapsulated" value types are those that are more like typical aggregates
> ("codes like a class") -- their constructors do important validation work, and
> only certain combinations of fields are allowed to represent valid instances.
> These are more likely to not have valid zero values (and hence want to be
> nullable).
I agree.
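To illustrate the two clusters, a sketch with two hypothetical primitive
classes (names and syntax are illustrative only):

primitive class Complex {  // "classic": any field combination is a valid
  double re, im;           // instance, and all-zeros is a fine default
  Complex(double re, double im) { this.re = re; this.im = im; }
}

primitive class Month {  // "encapsulated": the constructor validates, and
  int value;             // the all-zeros instance would be invalid
  Month(int value) {
    if (value < 1 || value > 12) { throw new IllegalArgumentException(); }
    this.value = value;
  }
}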
>
> Some questions to consider for this approach:
>
> - How do we group features into clusters so that they meet the sweet spot of
> user expectations and use cases while minimizing complexity? Is two clusters
> the right number? Is two already too many? (And what do we call them? What
> keywords best convey the intended intuitions?)
Two is too many, see below.
>
> - If there are knobs within the clusters, what are the right defaults? E.g.,
> should atomicity be opt-in or opt-out?
I prefer opt-in, see below.
>
> - What are the performance costs (or, in the other direction, performance gains)
> associated with each feature? For certain feature combinations, have we
> canceled out the performance gains over identity classes (and at that point, is
> that combination even worth supporting?)
Good question ...
Let me reformulate.
But before that, we can note that we have 3 ways of specifying primitive class
features (sketched just below):
- we can use different types, for example Foo.val vs Foo.ref
- we can have container attributes (opt-in or opt-out), for example declaring a
field volatile makes it non-tearable
- we have runtime knobs, for example an array can allow null or not
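For concreteness, one field per knob style, using a hypothetical primitive
class P (prototype-style syntax, an assumption):

class Knobs {
  P.ref viaType;             // 1) type-level knob: P.ref vs P.val
  volatile P viaAttribute;   // 2) container attribute: volatile makes the field non-tearable
  P[] viaRuntime = new P[8]; // 3) runtime knob: the array itself carries its null-acceptance bit
}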
First, the problem: as you said, if we have code like the one just below, the
field primFoo is flattened, so primFoo.someValue is 0, bypassing the
constructor.
primitive class PrimFoo {
  int someValue;

  PrimFoo(int someValue) {
    if (someValue == 0) { throw new IllegalArgumentException(); }
    this.someValue = someValue;
  }
}

class Foo {
  PrimFoo primFoo;  // flattened: implicitly initialized to the all-zeros default
}
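A minimal usage sketch of the failure mode (hypothetical syntax, assuming the
field is flattened as described):

Foo foo = new Foo();
// PrimFoo's constructor never ran for this field: the flattened storage holds
// the all-zeros default, so this prints 0, a value the constructor forbids
System.out.println(foo.primFoo.someValue);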
I believe we should try to make a primitive class nullable and flattenable by
default, so we have one tent pole and knobs for 2 special cases: non-nullable
primitive classes (for use cases like Complex) and non-flattenable classes when
stored in a field/array cell (the "atomicity" use case).
So a primitive class (the default):
- represents the null value (its initial value) with a supplementary field when
stored on heap, and a supplementary register if necessary
- is tearable in case of racy code (don't write racy code)
- is represented by a Q-type in the bytecode for full flattening, or an L-type
using a pointer to be backward compatible
- is represented by different java.lang.Class objects (one for the Q-type, the
primary class, and one for the L-type, the secondary class)
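A minimal sketch of the two mirrors, assuming the prototype's .ref projection
syntax:

Class<?> primary = PrimFoo.class;         // mirror of the Q-type (primary class)
Class<?> secondary = PrimFoo.ref.class;   // mirror of the L-type (secondary class)
System.out.println(primary == secondary); // false: two distinct java.lang.Class objects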
I think that a Q-type can be backward compatible with an L-type in method
descriptors: a Q-type should be represented as an L-type plus an out-of-band
bit saying that this is a Q-type, so it should be loaded eagerly (like we use
out-of-band attributes for generic specialization). Obviously, the way to
create a Q-type (default + with + with) is still different from an L-type (new
+ dup + invokespecial), so creating a Q-type instead of an L-type is not
backward compatible. So the VM has to generate several method entry points for
a method that is annotated with the attribute saying there is a Q-type in the
descriptor (or that overrides a method with such an attribute).
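For illustration, a rough side-by-side of the two creation sequences for
something like new PrimFoo(42), using the prototype's defaultvalue/withfield
opcodes (the exact shapes are an assumption):

L-type creation:                      Q-type creation:
  new PrimFoo                           defaultvalue PrimFoo
  dup                                   bipush 42
  bipush 42                             withfield PrimFoo.someValue:I
  invokespecial PrimFoo.<init>(I)V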
The special cases:
1) non-nullable when flattened.
I believe that all primitive types should be nullable, but that a user should
have a knob to choose that a primitive class is non-nullable when flattened.
So the VM will throw an NPE if a field or an array is annotated with something
saying that null is not a supported value.
For arrays, we already have that bit at runtime; I believe we should have a
modifier for fields saying that null is not a possible value when flattened.
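A sketch of that knob on arrays (prototype-style syntax; the surface syntax is
an assumption):

PrimFoo[] flat = new PrimFoo[10];          // flattened storage, null-hostile at runtime
flat[0] = null;                            // the VM throws NullPointerException
PrimFoo.ref[] boxed = new PrimFoo.ref[10]; // pointer-based storage
boxed[0] = null;                           // fine: null is a supported value here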
2) non-tearable.
We already support the modifier 'volatile' to say that a primitive class
should be manipulated by pointer.
Should we have a declaration-site keyword? I don't know; it's perhaps a
corner case where not using a primitive class is better.
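A minimal sketch of the use-site opt-in (assuming PrimFoo from above):

class Wallet {
  volatile PrimFoo balance;  // volatile forces a pointer representation, so
                             // racy readers never observe a torn mix of fields
}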
To summarize, I believe that if a primitive class is always nullable (apart
from some opt-in special cases), it can be backward compatible (enough) to
transform all value-based classes into primitive classes and just let the new
version of javac replace all the L-types by Q-types in the method descriptors
(using an attribute), without asking the user to think too much about it
(apart from racy code).
regards,
Rémi