The current stacking discussion is motivated by several factors:

 - experiences prototyping both B2 and B3
 - recently discovered hardware improvements in atomic operations (e.g., Intel's recent specification strengthening around 128-bit vector loads and stores)
 - further thought on the consequences of the B2/B3 model, particularly with regard to tearing

The B2/B3 split was a useful proxy during prototyping, with each being built around a known use case: B2 around value-based classes, and B3 around numeric abstractions.  My main objection to the split is twofold.  First, there are gratuitous-seeming differences in performance model (B3s flatten much better currently), which puts users to bad choices between semantics and performance.  Second, tearing is hidden behind some other proxy ("primitive-ness", non-nullity, etc), which is likely to surprise users when invariants are checked in the constructor but not necessarily obeyed at runtime.  I want the observed behavioral distinctions between buckets to be clearly related to their semantic differences, and we're not there yet.

The differences in flattening and performance between the current B2 and B3 derive directly from the possibility of tearing. When tearing is unacceptable, we are likely to fall back on using indirections to make loads and stores of references atomic (the "non-flat" option); even where we are able to gain some flattening through compiler heroics (the "low flat" option), these hit the ceiling pretty fast (we're unlikely to get above 128 bits any time soon, and may need at least one bit for null) and these also have other costs (wider loads and stores mean more data movement and more register shuffling, in addition to the complexity of the required compiler heroics).  Full-flat requires tearing.  But I don't see an intrinsic reason (yet) why we can't have full-flat for VBCs like Optional.
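
To make the "low flat" idea concrete, here is a minimal sketch in today's Java, not a description of what the VM actually does.  It assumes a two-int payload packed by hand into one 64-bit word, so that a single atomic store publishes both fields together.  The class name LowFlatSketch and its helpers are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: hand-rolled "low flat" storage for a two-int value.
// Both fields live in one 64-bit word, so one atomic store publishes them together.
public class LowFlatSketch {
    // Stand-in for a flattened heap slot that the hardware can update atomically.
    public static final AtomicLong slot = new AtomicLong();

    // Pack lo into the high 32 bits and hi into the low 32 bits.
    public static long pack(int lo, int hi) {
        return ((long) lo << 32) | (hi & 0xFFFFFFFFL);
    }

    // Arithmetic shift recovers the (possibly negative) high half.
    public static int lo(long bits) { return (int) (bits >> 32); }

    // Narrowing cast keeps the low 32 bits.
    public static int hi(long bits) { return (int) bits; }

    public static void main(String[] args) {
        slot.set(pack(-5, 7));      // one 64-bit store, so no reader sees half of it
        long bits = slot.get();     // one 64-bit load
        System.out.println(lo(bits) + ".." + hi(bits));
    }
}
```

Note that the two ints already consume all 64 bits; a nullable version would need a 65th bit, which is the ceiling effect described above.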

The most encouraging direction is to factor atomicity out of the bucket model.  We can make both buckets (VBC and primitive-like) atomic by default; this still gets us all the calling convention optimizations, and for very small values (such as single field ones, like Optional), we can probably achieve full flattening in the heap, and more flattening for small-ish values with low-flat heroics.  We can allow both buckets to opt into non-atomicity, which unlocks full-flat layout in the heap, with the only difference being whether we have to perturb the representation to make null representable.

This gets us to something like:

    [ atomic | non-atomic ] __value class B2 { }
    [ atomic | non-atomic ] __primitive class B3 { }

There are many bikesheds here, including the spelling of all these things, and whether or not we say "class" or "struct" or "primitive" or nothing at all, or whether these work with records, but painting can come later.   There are also many other decisions to make, but I'll observe several properties we've already gained by this stacking:

 - non-atomicity is explicit, rather than hiding it behind "primitive" or "non-nullable" or "zero-happy"
 - non-atomicity is orthogonal, which means that the performance difference between B2 and B3 (or B3.val and B3.ref), for either polarity of atomicity, is exclusively that imposed by the null-encoding requirement
 - safe by default, can opt into more performance by opting out of some safety
 - non-atomic sounds "just scary enough" to make people think twice, or at least learn what non-atomic means

Atomicity is only needed when a class has cross-field invariants (or when its construction API varies significantly from its representation).  Numeric classes like Complex have no invariants, and Rational has only single-field invariants, but classes like IntRange would have cross-field invariants.  In cases where the VM can provide atomicity for free (e.g., single-field classes), it wouldn't make a difference.
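
As a rough illustration of why cross-field invariants are the dividing line, the sketch below simulates a torn read deterministically, without threads.  It assumes a flattened, non-atomic IntRange slot modeled as two independently written int words; the names TearingSketch, slot, storeLo and storeHi are hypothetical, and the interleaving is scripted by hand rather than produced by a real race.

```java
// Hypothetical sketch: a flattened, non-atomic slot modeled as two separate words.
public class TearingSketch {
    // An ordinary class standing in for a value class with a cross-field invariant.
    public static final class IntRange {
        public final int lo;
        public final int hi;
        public IntRange(int lo, int hi) {
            if (lo > hi) throw new IllegalArgumentException("lo > hi");
            this.lo = lo;
            this.hi = hi;
        }
    }

    // Simulated flat heap slot: slot[0] holds lo, slot[1] holds hi.
    static final int[] slot = new int[2];

    // A non-atomic store is really two independent word writes.
    static void storeLo(IntRange r) { slot[0] = r.lo; }
    static void storeHi(IntRange r) { slot[1] = r.hi; }

    // Scripted interleaving: the reader looks at the slot between the two writes.
    public static boolean tornReadViolatesInvariant() {
        IntRange a = new IntRange(0, 1);     // passes the constructor check
        IntRange b = new IntRange(10, 20);   // passes the constructor check
        storeLo(a);
        storeHi(a);
        storeLo(b);                          // writer is halfway through storing b
        int seenLo = slot[0];                // reader sees lo from b (10)
        int seenHi = slot[1];                // reader sees hi from a (1)
        storeHi(b);                          // writer finishes
        return seenLo > seenHi;              // invariant lo <= hi is broken
    }

    public static void main(String[] args) {
        System.out.println("torn read violates lo <= hi: " + tornReadViolatesInvariant());
    }
}
```

Both a and b were valid when constructed, yet the reader observed the pair (10, 1), which no constructor call ever produced.  A class like Complex has no such pair to protect, so the same interleaving would be harmless there.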

If we further opt for Kevin's "ref is default" proposal, then we add another:

 - All unadorned type names are reference types

Separately, I think we can reconsider where we spend the "value" keyword.  Previously "value" meant "non-identity", but I think it is better spent meaning "has a value projection", which leads us to the minor reshuffling presented yesterday:

    class B1 { }                 // ref only, == based on identity
    value-based class B2 { }     // ref only, == based on state
    value class B3 { }           // has ref and val projections

This affirms B2 as "value-lite", connects to the term we colonized in Java 8 for "classes that have value-like semantics", and moves away from "primitive".

Let's work through Kevin's examples here:

 - Rational.  Here, the default value is particularly bad (denominator should not be zero).  This leads to an uncomfortable choice; choose B2, or choose B3 and deal with the DBZE as "user error" when it happens.  Internal methods (e.g., multiply two rationals) can treat the default value as "0/1" instead and produce a valid rational, but any code that pulls out the denominator and operates on it externally will confront the zero anyway.  Whichever way one chooses, people will complain "but that's bad".  Rational is interesting because it _has_ a sensible default, it is just not the zero representation.
 - EmployeeId.  Similar, but maybe more tolerable to treat as a B2, and doesn't require atomicity.
 - Instant.  Seems this is a (probably non-atomic) B2.
 - Complex.  Solid non-atomic B3.
 - Optional, OptionalInt, etc.  In a world where B3 is ref-default, these can be B3; otherwise B2.
 - IntRange: atomic B3 (cross-field invariant.)
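
The Rational case can be sketched in ordinary Java.  This is an illustration under stated assumptions, not a proposed API: the all-zero default is simulated by a private unchecked constructor, and the names RationalSketch, DEFAULT, safeDen and rawDen are hypothetical.

```java
// Hypothetical sketch: a Rational whose all-zero default has denominator 0.
public final class RationalSketch {
    private final long num;
    private final long den;   // 0 only in the simulated default value

    // Unchecked constructor, standing in for how a zero default bypasses validation.
    private RationalSketch(long num, long den) {
        this.num = num;
        this.den = den;
    }

    // Simulated default value: every field is zero.
    public static final RationalSketch DEFAULT = new RationalSketch(0, 0);

    // The public factory enforces the single-field invariant.
    public static RationalSketch of(long num, long den) {
        if (den == 0) throw new IllegalArgumentException("zero denominator");
        return new RationalSketch(num, den);
    }

    // Internal view: treat the default 0/0 as 0/1.
    private long safeDen() { return den == 0 ? 1 : den; }

    // Internal methods can stay valid even when handed the default (no reduction, for brevity).
    public RationalSketch times(RationalSketch other) {
        return new RationalSketch(num * other.num, safeDen() * other.safeDen());
    }

    public long num() { return num; }

    // External code that pulls out the raw denominator still sees the zero.
    public long rawDen() { return den; }

    public static void main(String[] args) {
        RationalSketch p = DEFAULT.times(of(3, 4));
        System.out.println(p.num() + "/" + p.rawDen());   // prints 0/4
        try {
            long q = 1 / DEFAULT.rawDen();                // integer division by zero
            System.out.println(q);
        } catch (ArithmeticException e) {
            System.out.println("external use confronts the zero");
        }
    }
}
```

The multiply path produces a valid rational from the default, while the external division still fails, which is the uncomfortable choice described for Rational above.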

There are lots of other things to discuss here, including what non-atomic B2 really means, and whether there are additional risks that come from tearing _between the null and the fields_.  I'll address that in a separate mail, but I think that factoring out atomic into its own explicit thing is a pure win, and that in turn exposes some sensible terminology shuffling in the other buckets.

Also, bikeshed topics to cover (please, let's not let this drown the discussion):
 - How to spell atomic / non-atomic
 - How to spell B2 and B3
 - How to spell .ref and .val
 - ref-default vs val-default for B3
   - if we go ref-default, reconciling this with universal generics
   - reconciling this with nullable types
