Re: [External] : Re: User model stacking

Brian Goetz Wed, 27 Apr 2022 16:17:59 -0700

(somehow two versions of this got sent, along with some cut and paste from 
another thread; please disregard whatever looks weird.)



> On Apr 27, 2022, at 5:50 PM, Brian Goetz <brian.go...@oracle.com> wrote:
> 
> Let me try and put some more color on the bike shed (but, again, let’s focus 
> on model, not syntax, for now.)  
> 
> We have two axes of variation we want to express with non-identity classes: 
> atomicity constraints, and whether there is an additional zero-default 
> companion type.  These can be mostly orthogonal; you can have either, 
> neither, or both.  We've been previously assuming that "primitiveness" lumps 
> this all together; primitives get more flattening, primitives can be 
> non-nullable/zero-default, primitives means the good name goes to the "val" 
> type.  Primitive-ness implicitly flips the "safety vs performance" priority, 
> which has been bothering us because primitives also code like a class.  So we 
> were trying to claw back some atomicity for primitives.  
> 
> But also, we're a little unhappy with B2 because B2 comes with _more_ 
> atomicity than is necessarily needed; a B2 with no invariants still gets less 
> flattening than a B3.  That's a little sad.  It also seems like a 
> gratuitous difference, which makes the user model more complicated.  So we’re 
> suggesting restacking towards: 
> 
> - Value classes are those without identity
> - Value classes can be atomic or non-atomic, the default is atomic (safe by 
> default)
> - Value classes can further opt into having a "val" projection (name TBD, val 
> is probably not it)
> - Val projections are non-nullable, zero-default — this is the only difference
> - Both the ref and val projections inherit the atomicity constraints of the 
> class, making atomicity mostly orthogonal to ref/val/zero/null
> 
> Example: classic B2
> 
>    value class B2a { } 
> 
> Because the default is atomic, we get the classic B2 semantics -- no 
> identity, but full final field safety guarantees.  VM has several strategies 
> for flattening in the heap: single-field classes always flattened (“full 
> flat”), multi-field classes can be flattened with "fat load and store" 
> heroics in the future (“low flat”), otherwise, indirection (“no flat”).
> 
> Example: non-atomic B2 
> 
>    non-atomic value class B2n { }
> 
> Here, the user has said "I have no atomicity requirements."  A B2n is a loose 
> aggregation of fields that can be individually written and read (full B3-like 
> flattening), with maybe an extra boolean field to encode null (VM's choice 
> how to encode, could use slack pointer bits etc.)  
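
To make "individually written and read" concrete, here is a minimal sketch in ordinary Java. It is only a model, not proposed syntax: the int[2] slot stands in for a flattened two-field value, and the Runnable stands in for a reader racing with the writer. All names (TearingSketch, nonAtomicWrite, observeMidWrite) are made up for illustration.

```java
// Hypothetical model, not Valhalla syntax: a two-field value "flattened"
// into an int[2] slot, the way a non-atomic layout might store it inline.
final class TearingSketch {

    // Writes the value field by field. The callback runs between the two
    // stores and stands in for a reader racing on another thread.
    static void nonAtomicWrite(int[] slot, int x, int y, Runnable racingReader) {
        slot[0] = x;
        racingReader.run();
        slot[1] = y;
    }

    // Returns what a reader observes if it looks at the slot midway through
    // overwriting (oldX, oldY) with (newX, newY).
    static int[] observeMidWrite(int oldX, int oldY, int newX, int newY) {
        int[] slot = {oldX, oldY};
        int[] seen = new int[2];
        nonAtomicWrite(slot, newX, newY, () -> {
            seen[0] = slot[0];
            seen[1] = slot[1];
        });
        return seen;
    }

    public static void main(String[] args) {
        // Both the old value (1, 1) and the new value (2, 2) satisfy x == y,
        // but the reader sees a mix of the two.
        int[] seen = observeMidWrite(1, 1, 2, 2);
        System.out.println("observed (" + seen[0] + ", " + seen[1] + ")");
    }
}
```

The reader sees the new x with the old y, a value nobody ever wrote. That is the state integrity a non-atomic class gives up, and the reason the default stays atomic.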
> 
> Example: atomic B3
> 
>    zero-capable value class B3a { }
> 
> This says I am declaring two types, B3a and B3a.zero.  (The syntax in this 
> quadrant sucks; need to find better.)  B3a is just like B2a above, because we 
> haven’t activated the zero capability at the use site.  
> B3a.zero/val/flat/whatever is non-nullable, zero-default, *but still has full 
> B2-classic atomicity*.  With the same set of flattening choices on the part 
> of the VM.  
> 
> Example: full primitive
> 
>    non-atomic zero-capable value class B3n { }
> 
> Here, B3n is like B2n, and B3n.zero is a full classic-B3 Q primitive with 
> full flattening.  
> 
> So:
> 
> - value-ness means "no identity, == means state equality"
> - You can add non-atomic to value-ness, meaning you give up state integrity
> - You can orthogonally add zero-capable to value-ness, meaning you get a 
> non-null, zero-happy companion, which inherits the atomic-ness
> 
> Some of the characteristics of this scheme:
> 
> - The default is atomicity / integrity FOR ALL BUCKETS (safe by default)
> - The default is nullability FOR ALL BUCKETS 
> - All unadorned type names are reference types / nullable 
> - All val-adorned type names (X.val) are non-nullable (or .zero, or .whatever)
> - Atomicity is determined by declaration site, can’t be changed at use site
> 
> The main syntactic hole is finding the right spelling for "zeroable" / .val.  
> There is some chance we can get away with spelling it `T!`, though this has 
> risks.  
> 
> Spelling zero-happy as any form of “flat” is probably a bad idea, because B2 
> can still be flat.  
> 
> A possible spelling for “non-atomic” is “relaxed”:
> 
>    relaxed value class B3n { }
> 
> Boilerplate-measurers would point out that to get full flattening, you have 
> to say three things at the declaration site and one extra thing at the use 
> site:
> 
>     relaxed zero-happy value class Complex { }
>     …
>     Complex! c;
> 
> If you forget relaxed, you get atomicity you may not need (though it might 
> not cost anything if the value is small.)  If you forget zero-happy, you 
> can’t say `Complex!`, 
> you can only say Complex, and the compiler will remind you.  If you forget 
> the !, you maybe get some extra footprint for the null bit.  None of these 
> are too bad, but the verbosity police might want to issue a warning here.  
> 
> It is possible we might want to flip the declaration of zero-capable, where 
> classes with no good default can opt OUT of the zero companion, rather than 
> the other way around:
> 
>     null-default value class LocalDate { } 
> 
> which says that LocalDate must use the nullable (LocalDate) form, not the 
> non-nullable (LocalDate.val/zero/bang) form.  
> 
> 
> On 4/22/2022 2:24 PM, Brian Goetz wrote:
> I think I have a restack of Dan's idea that feels like fewer buckets.  
> 
> We have two axes of variation we want to express with flattenable types: 
> atomicity constraints, and whether there is an additional zero-default 
> companion type.  
> 
> We've been assuming that "primitiveness" lumps this all together; primitives 
> get more flattening, primitives can be non-nullable/zero-default, primitives 
> means the good name goes to the "val" type.  Primitive-ness implicitly flips 
> the "safety vs performance" priority, which is bothering us because 
> primitives also code like a class.  So we're trying to claw back some 
> atomicity for primitives.  
> 
> But also, we're a little unhappy with B2 because B2 comes with _more_ 
> atomicity than is necessarily needed; a B2 with no invariants still gets less 
> flattening.  That's a little sad.  Let's restack the pieces (again).  
> 
> - Value classes are those without identity
> - Value classes can be atomic or non-atomic, the default is atomic (safe)
> - Value classes can further opt into having a "val" projection (name TBD, val 
> is probably not it)
> - Val projections are non-nullable, zero-default
> - Both the ref and val projections inherit the atomicity constraints of the 
> class, making atomicity mostly orthogonal to ref/val/zero/null
> 
> Example: classic B2
> 
>    value class B2 { } 
> 
> Because the default is atomic, we get the classic B2 semantics -- no 
> identity, but full final field safety guarantees.  VM has several strategies 
> for flattening in the heap: single-field classes always flattened, 
> multi-field classes can be flattened with "fat load and store" heroics in the 
> future, otherwise, indirection.  
> 
> Example: non-atomic B2 
> 
>    non-atomic value class B2a { }
> 
> Here, the user has said "I have no atomicity requirements."  A B2a is a loose 
> aggregation of fields that can be individually written and read (full B3-like 
> flattening), with maybe an extra boolean field to encode null (VM's choice 
> how to encode.)  
> 
> Example: atomic B3
> 
>    zero-capable value class B3a { }
> 
> This says I am declaring two types, B3a and B3a.zero.  (These names suck; 
> need better ones.)  B3a is just like B2 above.  B3a.zero is non-nullable, 
> zero-default, *but still has full B2-classic atomicity*.  With the same set 
> of flattening choices.  
> 
> Example: full primitive
> 
>    non-atomic zero-capable value class B3b { }
> 
> Here, B3b is like B2a, and B3b.zero is a full classic-B3 Q primitive with 
> full flattening.  
> 
> 
> So the stacking is:
> 
> - value-ness means "no identity, == means state equality"
> - You can add non-atomic to value-ness, meaning you give up state integrity
> - You can orthogonally add zero-capable to value-ness, meaning you get a 
> non-null, zero-happy companion
> 
> This is starting to feel more honest....
> 
> 
> 
> 
> 
> On 4/19/2022 6:45 PM, Brian Goetz wrote:
> By choosing to modify the class, we are implicitly splitting into Buckets 3a 
> and 3n:
> 
> - B2 gives up identity
> - B3a further gives up nullity
> - B3n further gives up atomicity
> 
> Which opens us up to a new complaint: people didn't even like the B2/B3 split 
> ("why does there have to be two"), and now there are three.  
> 
> Given that atomic/non-atomic only work with primitives, maybe there's a way to 
> compress this further?
> 
> On 4/19/2022 6:25 PM, Dan Smith wrote:
> On Apr 19, 2022, at 2:49 PM, Brian Goetz <brian.go...@oracle.com> wrote:
> 
> So, what shall we do when the user says non-atomic, but the constructor 
> expresses a multi-field invariant?
> 
> Lint warning, if we can detect it and that warning is turned on.
> 
> 
> On Apr 19, 2022, at 3:22 PM, Brian Goetz <brian.go...@oracle.com> wrote:
> 
> Stepping back, what you're saying is that we manage atomicity among a subset 
> of fields by asking the user to arrange the related fields in a separate 
> class, and give that class extra atomicity.  If we wanted to express 
> ColoredDiagonalPoint, in this model we'd say something like:
> 
>    non-atomic primitive ColoredDiagonalPoint { 
>        private DiagonalPoint p;
>        private Color c;
> 
>        private atomic primitive DiagonalPoint {
>            private int x, y;
> 
>            DiagonalPoint(int x, int y) { 
>                if (x != y) throw;
>                ...
>            }
>        }
>    }
> 
> Right?
> 
> Yep. Good illustration of how just providing a class modifier gives 
> programmers significant fine-grained control.
> 
> 
> We exempt the single-field classes from having an opinion.  We could also 
> exempt primitive records with no constructor behavior.  
> 
> Yeah, but (1) hard to identify all assumed invariants—some might appear in 
> factories, etc., or informally in javadoc; and (2) even in a class with no 
> invariants, it's probably useful for the author to explicitly acknowledge 
> that they understand tearing risks.
> 
> 
> What it gives up (without either a change in programming model, or compiler 
> heroics), is the ability to correlate between user-written invariants and the 
> corresponding atomicity constraints, which could guide users away from 
> errors.  Right?
> 
> Right. Could still do that if we wanted, but my opinion is that it's too much 
> language surface for the scale of the problem. If we did have additional 
> construction constraints, I'd prefer that atomic primitives allow full 
> imperative construction logic & encapsulation.
> 
> This feels analogous to advanced typing analyses that might prove certain 
> casts to be safe/unsafe. Sure, the language could try to be helpful by 
> implementing that analysis, but it would add lots of complexity, and 
> ultimately it's either a best-effort check or annoyingly restrictive.
> 
>> On Apr 27, 2022, at 2:51 PM, Dan Heidinga <heidi...@redhat.com> wrote:
>> 
>> I'm trying to understand how this refactoring fits the VM physics.
>> 
>> In particular, __non-atomic & __zero-ok fit together at the VM level
>> because the VM's natural state for non-atomic (flattened) data is zero
>> filled.  When those two items are decoupled, I'm unclear on what the
>> VM would offer in that case.  Thoughts?
>> 
>> How does "__non-atomic __non-id class B2a { }" fit with the "no new
>> nulls" requirements?
>> 
>> --Dan
>> 
>> On Wed, Apr 27, 2022 at 12:45 PM Brian Goetz <brian.go...@oracle.com> wrote:
>>> 
>>> Here’s some considerations for stacking the user model.  (Again, please 
>>> let’s resist the temptation to jump to the answer and then defend it.)
>>> 
>>> We have a stacking today which says:
>>> 
>>> - B1 is ordinary identity classes, giving rise to a single reference type
>>> - B2 are identity-free classes, giving rise to a single reference type
>>> - B3 are flattenable identity-free classes, giving rise to both a reference 
>>> (L/ref) and primitive (Q/val) type.
>>> 
>>> This stacking has some pleasant aspects.  B2 differs from B1 by “only one 
>>> bit”: identity.  The constraints on B2 are those that come from the lack of 
>>> identity (mutability, extensibility, locking, etc.)  B2 references behave 
>>> like the object references we are familiar with; nullability, final field 
>>> guarantees, etc.  B3 further makes reference-ness optional; reference-free 
>>> B3 values give up the affordances of references: they are zero-default and 
>>> tearable.  This stacking is nice because it can be framed as a sequence of 
>>> “give up some X, get some Y”.
>>> 
>>> People keep asking “do we need B2, or could we get away with B1/B3”.  The 
>>> main reason for having this distinction is that some id-free classes have 
>>> no sensible default, and so want to use null as their default.  This is a 
>>> declaration-site property; B3 means that the zero value is reasonable, and 
>>> use sites can opt into / out of  zero-default / nullity.  We’d love to 
>>> compress away this bucket but forcing a zero on classes that can’t give it 
>>> a reasonable interpretation is problematic.  But perhaps we can reduce the 
>>> visibility of this in the model.
>>> 
>>> The degrees of freedom we could conceivably offer are
>>> 
>>>   { identity or not, zero-capable or not, atomic or not } x { use-site, 
>>> declaration-site }
>>> 
>>> In actuality, not all of these boxes make sense (disavowing the identity of 
>>> an ArrayList at the use site), and some have been disallowed by the 
>>> stacking (some characteristics have been lumped.)  Here’s another way to 
>>> stack the declaration:
>>> 
>>> - Some classes can disavow identity
>>> - Identity-free classes can further opt into zero-default (currently, B3, 
>>> polarity chosen at use site)
>>> - Identity-free classes can further opt into tearability (currently, B3, 
>>> polarity chosen at use site)
>>> 
>>> It might seem the sensible move here is to further split B3 into B3a and 
>>> B3b (where all B3 support zero default, and a/b differ with regard to 
>>> whether immediate values are tearable).  But that may not be the ideal 
>>> stacking, because we want good flattening for B2 (and B3.ref) also.  
>>> Ideally, the difference between B2 and B3.val is nullity only (Kevin’s 
>>> antennae just went up.)
>>> 
>>> So another possible restacking is to say that atomicity is something that 
>>> has to be *opted out of* at the declaration site (and maybe also at the use 
>>> site.)  With deliberately-wrong syntax:
>>> 
>>>   __non-id class B2 { }
>>> 
>>>   __non-atomic __non-id class B2a { }
>>> 
>>>   __zero-ok __non-id  class B3 { }
>>> 
>>>   __non-atomic __zero-ok  __non-id class B3a { }
>>> 
>>> In this model, you can opt out of identity, and then you can further opt 
>>> out of atomicity and/or null-default.  This “pulls up” the 
>>> atomicity/tearability to a property of the class (I’d prefer safe by 
>>> default, with opt out), and makes zero-*capability* an opt-in property of 
>>> the class.  Then for those that have opted into zero-capability, at the use 
>>> site, you can select .ref (null) / .val (zero).  Obviously these all need 
>>> better spellings.  This model frames specific capabilities as modifiers on 
>>> the main bucket, so it could be considered either a two bucket, or a four 
>>> bucket model, depending on how you look.
>>> 
>>> The author is in the best place to make the atomicity decision, since they 
>>> know the integrity constraints.  Single field classes, or classes with only 
>>> single field invariants (denominator != 0), do not need atomicity.  Classes 
>>> with multi-field invariants do.
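
A small sketch of why the single-field case is safe, again in plain Java and with hypothetical names. It assumes each field is itself written whole, so a torn value is modeled as the first field of one valid value combined with the second field of another. The rational invariant is the denominator != 0 example from the text; the range invariant (lo <= hi) is an assumed stand-in for a multi-field one.

```java
// Hypothetical illustration: which invariants survive a torn read.
final class InvariantSketch {

    // The two values a reader could assemble from fields of a and b.
    static int[][] tornMixes(int[] a, int[] b) {
        return new int[][] {{a[0], b[1]}, {b[0], a[1]}};
    }

    // Single-field invariant: {numerator, denominator}, denominator != 0.
    static boolean rationalOk(int[] v) {
        return v[1] != 0;
    }

    // Multi-field invariant: {lo, hi}, lo <= hi.
    static boolean rangeOk(int[] v) {
        return v[0] <= v[1];
    }

    public static void main(String[] args) {
        // 1/2 and 3/4 are valid, and so is every mix of their fields.
        for (int[] mix : tornMixes(new int[] {1, 2}, new int[] {3, 4})) {
            System.out.println("rational " + mix[0] + "/" + mix[1]
                    + " ok=" + rationalOk(mix));
        }
        // [0, 1] and [5, 9] are valid, but the mix (5, 1) is not a range.
        for (int[] mix : tornMixes(new int[] {0, 1}, new int[] {5, 9})) {
            System.out.println("range [" + mix[0] + ", " + mix[1] + "]"
                    + " ok=" + rangeOk(mix));
        }
    }
}
```

Any mix of valid rationals keeps a nonzero denominator, because the invariant only looks at one field. The range invariant relates two fields, so a mix can break it even though both inputs were valid.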
>>> 
>>> This differs from the previous stacking in that it moves the spotlight from 
>>> _references_ and their properties, to the properties themselves.  It says 
>>> to class writers: you should declare the ways in which you are willing to 
>>> trade safety for performance; you can opt out of the requirement for 
>>> references and nulls (saving some footprint) and atomicity (faster access). 
>>>  It says to class *users*, you can pick the combination of characteristics, 
>>> allowed by the author, that meet your needs (can always choose null default 
>>> if you want, just use a ref.)
>>> 
>>> There are many choices here about “what are the defaults”.  More opting in 
>>> at the declaration site might mean less need to opt in at the use site.  Or 
>>> not.
>>> 
>>> (We are now in the stage which I call “shake the box”; we’ve named all the 
>>> moving parts, and now we’re looking for the lowest-energy state we can get 
>>> them into.)
>>> 
>> 
>
