(somehow two versions of this got sent, along with some cut and paste from another thread; please disregard whatever looks weird.)
> On Apr 27, 2022, at 5:50 PM, Brian Goetz <brian.go...@oracle.com> wrote:
>
> Let me try and put some more color on the bike shed (but, again, let’s focus
> on model, not syntax, for now.)
>
> We have two axes of variation we want to express with non-identity classes:
> atomicity constraints, and whether there is an additional zero-default
> companion type. These can be mostly orthogonal; you can have either,
> neither, or both. We've been previously assuming that "primitiveness" lumps
> this all together; primitives get more flattening, primitives can be
> non-nullable/zero-default, primitives means the good name goes to the "val"
> type. Primitive-ness implicitly flips the "safety vs performance" priority,
> which has been bothering us because primitives also code like a class. So we
> were trying to claw back some atomicity for primitives.
>
> But also, we're a little unhappy with B2 because B2 comes with _more_
> atomicity than is necessarily needed; a B2 with no invariants still gets less
> flattening than a B3. That's a little sad. And also that it seems like a
> gratuitous difference, which makes the user model more complicated. So we’re
> suggesting restacking towards:
>
> - Value classes are those without identity
> - Value classes can be atomic or non-atomic, the default is atomic (safe by
>   default)
> - Value classes can further opt into having a "val" projection (name TBD, val
>   is probably not it)
> - Val projections are non-nullable, zero-default -- this is the only difference
> - Both the ref and val projections inherit the atomicity constraints of the
>   class, making atomicity mostly orthogonal to ref/val/zero/null
>
> Example: classic B2
>
>     value class B2a { }
>
> Because the default is atomic, we get the classic B2 semantics -- no
> identity, but full final field safety guarantees.
> VM has several strategies for flattening in the heap: single-field classes
> always flattened (“full flat”), multi-field classes can be flattened with
> "fat load and store" heroics in the future (“low flat”), otherwise,
> indirection (“no flat”)
>
> Example: non-atomic B2
>
>     non-atomic value class B2n { }
>
> Here, the user has said "I have no atomicity requirements." A B2n is a loose
> aggregation of fields that can be individually written and read (full B3-like
> flattening), with maybe an extra boolean field to encode null (VM's choice
> how to encode, could use slack pointer bits etc.)
>
> Example: atomic B3
>
>     zero-capable value class B3a { }
>
> This says I am declaring two types, B3a and B3a.zero. (The syntax in this
> quadrant sucks; need to find better.) B3a is just like B2a above, because we
> haven’t activated the zero capability at the use site.
> B3a.zero/val/flat/whatever is non-nullable, zero-default, *but still has full
> B2-classic atomicity*. With the same set of flattening choices on the part
> of the VM.
>
> Example: full primitive
>
>     non-atomic zero-capable value class B3n { }
>
> Here, B3n is like B2n, and B3n.zero is a full classic-B3 Q primitive with
> full flattening.
>
> So:
>
> - value-ness means "no identity, == means state equality"
> - You can add non-atomic to value-ness, meaning you give up state integrity
> - You can orthogonally add zero-capable to value-ness, meaning you get a
>   non-null, zero-happy companion, which inherits the atomic-ness
>
> Some of the characteristics of this scheme:
>
> - The default is atomicity / integrity FOR ALL BUCKETS (safe by default)
> - The default is nullability FOR ALL BUCKETS
> - All unadorned type names are reference types / nullable
> - All Val-adorned type names (X.val) are non-nullable (or .zero, or .whatever)
> - Atomicity is determined by declaration site, can’t be changed at use site
>
> The main syntactic hole is finding the right spelling for "zeroable" / .val.
> There is some chance we can get away with spelling it `T!`, though this has
> risks.
>
> Spelling zero-happy as any form of “flat” is probably a bad idea, because B2
> can still be flat.
>
> A possible spelling for “non-atomic” is “relaxed”:
>
>     relaxed value class B3n { }
>
> Boilerplate-measurers would point out that to get full flattening, you have
> to say three things at the declaration site and one extra thing at the use
> site:
>
>     relaxed zero-happy value class Complex { }
>     …
>     Complex! c;
>
> If you forget relaxed, you might get atomicity (but might not cost anything,
> if the value is small.) If you forget zero-happy, you can’t say `Complex!`,
> you can only say Complex, and the compiler will remind you. If you forget
> the !, you maybe get some extra footprint for the null bit. None of these
> are too bad, but the verbosity police might want to issue a warning here.
>
> It is possible we might want to flip the declaration of zero-capable, where
> classes with no good default can opt OUT of the zero companion, rather than
> the other way around:
>
>     null-default value class LocalDate { }
>
> which says that LocalDate must use the nullable (LocalDate) form, not the
> non-nullable (LocalDate.val/zero/bang) form.
>
>
> On 4/22/2022 2:24 PM, Brian Goetz wrote:
> I think I have a restack of Dan's idea that feels like fewer buckets.
>
> We have two axes of variation we want to express with flattenable types:
> atomicity constraints, and whether there is an additional zero-default
> companion type.
>
> We've been assuming that "primitiveness" lumps this all together; primitives
> get more flattening, primitives can be non-nullable/zero-default, primitives
> means the good name goes to the "val" type. Primitive-ness implicitly flips
> the "safety vs performance" priority, which is bothering us because
> primitives also code like a class. So we're trying to claw back some
> atomicity for primitives.
>
> But also, we're a little unhappy with B2 because B2 comes with _more_
> atomicity than is necessarily needed; a B2 with no invariants still gets less
> flattening. That's a little sad. Let's restack the pieces (again).
>
> - Value classes are those without identity
> - Value classes can be atomic or non-atomic, the default is atomic (safe)
> - Value classes can further opt into having a "val" projection (name TBD, val
>   is probably not it)
> - Val projections are non-nullable, zero-default
> - Both the ref and val projections inherit the atomicity constraints of the
>   class, making atomicity mostly orthogonal to ref/val/zero/null
>
> Example: classic B2
>
>     value class B2 { }
>
> Because the default is atomic, we get the classic B2 semantics -- no
> identity, but full final field safety guarantees. VM has several strategies
> for flattening in the heap: single-field classes always flattened,
> multi-field classes can be flattened with "fat load and store" heroics in the
> future, otherwise, indirection.
>
> Example: non-atomic B2
>
>     non-atomic value class B2a { }
>
> Here, the user has said "I have no atomicity requirements." A B2a is a loose
> aggregation of fields that can be individually written and read (full B3-like
> flattening), with maybe an extra boolean field to encode null (VM's choice
> how to encode.)
>
> Example: atomic B3
>
>     zero-capable value class B3a { }
>
> This says I am declaring two types, B3a and B3a.zero. (These names suck;
> need better ones.) B3a is just like B2 above. B3a.zero is non-nullable,
> zero-default, *but still has full B2-classic atomicity*. With the same set
> of flattening choices.
>
> Example: full primitive
>
>     non-atomic zero-capable value class B3b { }
>
> Here, B3b is like B2a, and B3b.zero is a full classic-B3 Q primitive with
> full flattening.
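
One way to sanity-check the four examples against each other is to write the two declaration-site choices out as a small table in ordinary Java. This is only a sketch: `RestackMatrix`, `Atomicity`, `Decl`, and the method names are made up for illustration, and nothing here is proposed syntax or VM behavior. It just encodes the rules as stated in the quoted message: the `.zero` companion exists only if the class opted in, unadorned names are nullable, and both projections inherit the class's atomicity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RestackMatrix {
    /** Declaration-site choice; ATOMIC is the default in the restack. */
    enum Atomicity { ATOMIC, NON_ATOMIC }

    /** Hypothetical model of one value class declaration (not real syntax). */
    static final class Decl {
        final String name;
        final Atomicity atomicity;
        final boolean zeroCapable;

        Decl(String name, Atomicity atomicity, boolean zeroCapable) {
            this.name = name;
            this.atomicity = atomicity;
            this.zeroCapable = zeroCapable;
        }

        /** Always the nullable type, plus the ".zero" companion if opted in. */
        List<String> types() {
            List<String> out = new ArrayList<>();
            out.add(name);
            if (zeroCapable) {
                out.add(name + ".zero");
            }
            return out;
        }

        /** Both projections inherit the class's atomicity. */
        boolean tearable() {
            return atomicity == Atomicity.NON_ATOMIC;
        }
    }

    /** Unadorned names are nullable; the ".zero" companion is not. */
    static boolean nullable(String typeName) {
        return !typeName.endsWith(".zero");
    }

    /** The four examples from the quoted message, in order. */
    static List<Decl> examples() {
        return Arrays.asList(
            new Decl("B2", Atomicity.ATOMIC, false),       // value class B2
            new Decl("B2a", Atomicity.NON_ATOMIC, false),  // non-atomic value class B2a
            new Decl("B3a", Atomicity.ATOMIC, true),       // zero-capable value class B3a
            new Decl("B3b", Atomicity.NON_ATOMIC, true));  // non-atomic zero-capable value class B3b
    }

    public static void main(String[] args) {
        for (Decl d : examples()) {
            for (String t : d.types()) {
                System.out.println(t + ": nullable=" + nullable(t)
                        + ", tearable=" + d.tearable());
            }
        }
    }
}
```

Running `main` lists six types in total: four nullable names and two `.zero` companions, with tearability following the declaration rather than the projection.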
>
>
> So the stacking is:
>
> - value-ness means "no identity, == means state equality"
> - You can add non-atomic to value-ness, meaning you give up state integrity
> - You can orthogonally add zero-capable to value-ness, meaning you get a
>   non-null, zero-happy companion
>
> This is starting to feel more honest....
>
>
> On 4/19/2022 6:45 PM, Brian Goetz wrote:
> By choosing to modify the class, we are implicitly splitting into Buckets 3a
> and 3n:
>
> - B2 gives up identity
> - B3a further gives up nullity
> - B3n further gives up atomicity
>
> Which opens us up to a new complaint: people didn't even like the B2/B3 split
> ("why does there have to be two"), and now there are three.
>
> Given that atomic/non-atomic only work with primitive, maybe there's a way to
> compress this further?
>
> On 4/19/2022 6:25 PM, Dan Smith wrote:
> On Apr 19, 2022, at 2:49 PM, Brian Goetz <brian.go...@oracle.com>
> wrote:
>
> So, what shall we do when the user says non-atomic, but the constructor
> expresses a multi-field invariant?
>
> Lint warning, if we can detect it and that warning is turned on.
>
>
> On Apr 19, 2022, at 3:22 PM, Brian Goetz <brian.go...@oracle.com>
> wrote:
>
> Stepping back, what you're saying is that we manage atomicity among a subset
> of fields by asking the user to arrange the related fields in a separate
> class, and give that class extra atomicity. If we wanted to express
> ColoredDiagonalPoint, in this model we'd say something like:
>
>     non-atomic primitive ColoredDiagonalPoint {
>         private DiagonalPoint p;
>         private Color c;
>
>         private atomic primitive DiagonalPoint {
>             private int x, y;
>
>             DiagonalPoint(int x, int y) {
>                 if (x != y) throw;
>                 ...
>             }
>         }
>     }
>
> Right?
>
> Yep. Good illustration of how just providing a class modifier gives
> programmers significant fine-grained control.
>
>
> We exempt the single-field classes from having an opinion. We could also
> exempt primitive records with no constructor behavior.
>
> Yeah, but (1) hard to identify all assumed invariants -- some might appear in
> factories, etc., or informally in javadoc; and (2) even in a class with no
> invariants, it's probably useful for the author to explicitly acknowledge
> that they understand tearing risks.
>
>
> What it gives up (without either a change in programming model, or compiler
> heroics), is the ability to correlate between user-written invariants and the
> corresponding atomicity constraints, which could guide users away from
> errors. Right?
>
> Right. Could still do that if we wanted, but my opinion is that it's too much
> language surface for the scale of the problem. If we did have additional
> construction constraints, I'd prefer that atomic primitives allow full
> imperative construction logic & encapsulation.
>
> This feels analogous to advanced typing analyses that might prove certain
> casts to be safe/unsafe. Sure, the language could try to be helpful by
> implementing that analysis, but it would add lots of complexity, and
> ultimately it's either a best-effort check or annoyingly restrictive.
>
>> On Apr 27, 2022, at 2:51 PM, Dan Heidinga <heidi...@redhat.com> wrote:
>>
>> I'm trying to understand how this refactoring fits the VM physics.
>>
>> In particular, __non-atomic & __zero-ok fit together at the VM level
>> because the VM's natural state for non-atomic (flattened) data is zero
>> filled. When those two items are decoupled, I'm unclear on what the
>> VM would offer in that case. Thoughts?
>>
>> How does "__non-atomic __non-id class B2a { }" fit with the "no new
>> nulls" requirements?
>>
>> --Dan
>>
>> On Wed, Apr 27, 2022 at 12:45 PM Brian Goetz <brian.go...@oracle.com> wrote:
>>>
>>> Here’s some considerations for stacking the user model. (Again, please
>>> let’s resist the temptation to jump to the answer and then defend it.)
>>>
>>> We have a stacking today which says:
>>>
>>> - B1 is ordinary identity classes, giving rise to a single reference type
>>> - B2 are identity-free classes, giving rise to a single reference type
>>> - B3 are flattenable identity-free classes, giving rise to both a reference
>>>   (L/ref) and primitive (Q/val) type.
>>>
>>> This stacking has some pleasant aspects. B2 differs from B1 by “only one
>>> bit”: identity. The constraints on B2 are those that come from the lack of
>>> identity (mutability, extensibility, locking, etc.) B2 references behave
>>> like the object references we are familiar with; nullability, final field
>>> guarantees, etc. B3 further makes reference-ness optional; reference-free
>>> B3 values give up the affordances of references: they are zero-default and
>>> tearable. This stacking is nice because it can be framed as a sequence of
>>> “give up some X, get some Y”.
>>>
>>> People keep asking “do we need B2, or could we get away with B1/B3”. The
>>> main reason for having this distinction is that some id-free classes have
>>> no sensible default, and so want to use null as their default. This is a
>>> declaration-site property; B3 means that the zero value is reasonable, and
>>> use sites can opt into / out of zero-default / nullity. We’d love to
>>> compress away this bucket but forcing a zero on classes that can’t give it
>>> a reasonable interpretation is problematic. But perhaps we can reduce the
>>> visibility of this in the model.
>>>
>>> The degrees of freedom we could conceivably offer are
>>>
>>>     { identity or not, zero-capable or not, atomic or not } x { use-site,
>>>     declaration-site }
>>>
>>> In actuality, not all of these boxes make sense (disavowing the identity of
>>> an ArrayList at the use site), and some have been disallowed by the
>>> stacking (some characteristics have been lumped.)
>>> Here’s another way to stack the declaration:
>>>
>>> - Some classes can disavow identity
>>> - Identity-free classes can further opt into zero-default (currently, B3,
>>>   polarity chosen at use site)
>>> - Identity-free classes can further opt into tearability (currently, B3,
>>>   polarity chosen at use site)
>>>
>>> It might seem the sensible move here is to further split B3 into B3a and
>>> B3b (where all B3 support zero default, and a/b differ with regard to
>>> whether immediate values are tearable). But that may not be the ideal
>>> stacking, because we want good flattening for B2 (and B3.ref) also.
>>> Ideally, the difference between B2 and B3.val is nullity only (Kevin’s
>>> antennae just went up.)
>>>
>>> So another possible restacking is to say that atomicity is something that
>>> has to be *opted out of* at the declaration site (and maybe also at the use
>>> site.) With deliberately-wrong syntax:
>>>
>>>     __non-id class B2 { }
>>>
>>>     __non-atomic __non-id class B2a { }
>>>
>>>     __zero-ok __non-id class B3 { }
>>>
>>>     __non-atomic __zero-ok __non-id class B3a { }
>>>
>>> In this model, you can opt out of identity, and then you can further opt
>>> out of atomicity and/or null-default. This “pulls up” the
>>> atomicity/tearability to a property of the class (I’d prefer safe by
>>> default, with opt out), and makes zero-*capability* an opt-in property of
>>> the class. Then for those that have opted into zero-capability, at the use
>>> site, you can select .ref (null) / .val (zero). Obviously these all need
>>> better spellings. This model frames specific capabilities as modifiers on
>>> the main bucket, so it could be considered either a two bucket, or a four
>>> bucket model, depending on how you look.
>>>
>>> The author is in the best place to make the atomicity decision, since they
>>> know the integrity constraints. Single field classes, or classes with only
>>> single field invariants (denominator != 0), do not need atomicity.
>>> Classes with multi-field invariants do.
>>>
>>> This differs from the previous stacking in that it moves the spotlight from
>>> _references_ and their properties, to the properties themselves. It says
>>> to class writers: you should declare the ways in which you are willing to
>>> trade safety for performance; you can opt out of the requirement for
>>> references and nulls (saving some footprint) and atomicity (faster access).
>>> It says to class *users*, you can pick the combination of characteristics,
>>> allowed by the author, that meet your needs (can always choose null default
>>> if you want, just use a ref.)
>>>
>>> There are many choices here about “what are the defaults”. More opting in
>>> at the declaration site might mean less need to opt in at the use site. Or
>>> not.
>>>
>>> (We are now in the stage which I call “shake the box”; we’ve named all the
>>> moving parts, and now we’re looking for the lowest-energy state we can get
>>> them into.)
>>>
>>
>
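
To make the single-field versus multi-field distinction from the quoted message concrete, here is a small plain-Java sketch. It does not use threads or any value-class feature; `TearingSketch`, `FlatSlot`, and `tornRead` are hypothetical names, and the "tear" is a fixed interleaving replayed by hand, under the assumption that a fully flattened non-atomic value is stored one field at a time. It shows a reader observing a pair that breaks a cross-field invariant (x == y), while a per-field invariant (denominator != 0) survives the same interleaving.

```java
public class TearingSketch {
    /** Stand-in for a flattened, non-atomic two-field value: each field is its own slot. */
    static final class FlatSlot {
        int a;
        int b;

        FlatSlot(int a, int b) {
            this.a = a;
            this.b = b;
        }
    }

    /**
     * Replays one fixed interleaving, without threads: the reader looks at the
     * slot after the writer has stored the first field but not yet the second.
     * Returns what the reader observed as {a, b}.
     */
    static int[] tornRead(FlatSlot slot, int newA, int newB) {
        slot.a = newA;                    // writer: first field stored
        int[] seen = { slot.a, slot.b };  // reader: new a, old b
        slot.b = newB;                    // writer: second field stored
        return seen;
    }

    /** Multi-field invariant (the DiagonalPoint example): x == y. */
    static boolean diagonalOk(int x, int y) {
        return x == y;
    }

    /** Single-field invariant (the rational example): denominator != 0. */
    static boolean denominatorOk(int denominator) {
        return denominator != 0;
    }

    public static void main(String[] args) {
        // DiagonalPoint (1,1) overwritten by (2,2): the reader sees (2,1).
        int[] p = tornRead(new FlatSlot(1, 1), 2, 2);
        System.out.println("point seen (" + p[0] + "," + p[1] + "), x == y: "
                + diagonalOk(p[0], p[1]));

        // Rational 1/2 overwritten by 3/4: the reader sees 3/2.
        int[] r = tornRead(new FlatSlot(1, 2), 3, 4);
        System.out.println("rational seen " + r[0] + "/" + r[1]
                + ", denominator != 0: " + denominatorOk(r[1]));
    }
}
```

The first line printed is `point seen (2,1), x == y: false` and the second is `rational seen 3/2, denominator != 0: true`. Note that 3/2 is still a value nobody wrote; what survives is only the per-field invariant, which is the sense in which such a class "does not need atomicity" here.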