We can divide the VM flattening strategy into three rough categories (would you 
like some milk with your eclair?):

 - non-flat — use a pointer
 - full-flat — inline the layout into the enclosing container, access with 
narrow loads
 - low-flat — use some combination of atomic operations to cram multiple fields 
into 64 or 128 bits, access with wide loads
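
As a rough illustration of the low-flat idea (not how any VM actually lays out fields), here is a sketch in plain Java that packs two int fields into a single 64-bit word. The class name PackedPair and its methods are hypothetical, and AtomicLong stands in for whatever wide load/store the VM would emit.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration: two int fields packed into one 64-bit word,
// so that a single wide load or store always sees a consistent pair.
final class PackedPair {
    private final AtomicLong bits = new AtomicLong();

    static long pack(int x, int y) {
        // High 32 bits hold x; low 32 bits hold y (masked to avoid sign extension).
        return ((long) x << 32) | (y & 0xFFFF_FFFFL);
    }

    static int x(long packed) { return (int) (packed >> 32); }
    static int y(long packed) { return (int) packed; }

    // One 64-bit store publishes both fields together.
    void set(int x, int y) { bits.set(pack(x, y)); }

    // One 64-bit load reads both fields together.
    long snapshot() { return bits.get(); }

    // Compare-and-set on the whole word: the "CAS window" in miniature.
    boolean compareAndSet(int oldX, int oldY, int newX, int newY) {
        return bits.compareAndSet(pack(oldX, oldY), pack(newX, newY));
    }
}
```

A reader calls snapshot() once and unpacks both fields from the same word, so it can never see x from one write and y from another. Anything wider than the window cannot be handled this way.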

B1 will always take the non-flat strategy.  Non-volatile B3s that are smaller 
than some threshold (e.g., a full cache line) will prefer the full-flat strategy.  
Non-atomic B2s can also pursue the full-flat strategy, but may have an extra 
field for the null channel.  Atomic B2/B3 may try the low-flat strategy, and 
fall back to non-flat where necessary.  Volatiles will likely choose non-flat, 
unless they fit in the CAS window.  But it is always the VM’s choice.  The user 
model may ask for nullability (represented however the VM wants: zero for 
non-flat, an extra channel for low/full flat), and may ask for atomicity (which 
influences the layout choice too, likely dropping down a notch from full to low 
or low to non).  So from a class file perspective, we need an extra bit 
(ACC_ATOMIC) which is acted on at layout time.  (B2 / B3.ref are translated as L 
with Preload, B3.val as Q, as today.)

So I think this mostly fits into the existing VM, with the addition of an 
ATOMIC bit which will constrain which flattening strategy we take at layout 
time.  This doesn’t sound like a complex new trick, just another reason to fall 
back to a weaker flattening strategy for a given layout.
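
To make the "fall back to a weaker strategy" idea concrete, here is a minimal sketch of what the layout-time decision might look like. Everything in it is an assumption for illustration: the names Flattening and LayoutChooser, the parameters, and the 512-bit (one 64-byte cache line) threshold are not taken from HotSpot or any spec.

```java
// Hypothetical model of the layout-time decision; names and thresholds are
// illustrative assumptions, not actual VM code.
enum Flattening { NON_FLAT, LOW_FLAT, FULL_FLAT }

final class LayoutChooser {
    // Assumed threshold: one 64-byte cache line.
    static final int FULL_FLAT_LIMIT_BITS = 512;

    /**
     * @param hasIdentity      true for B1 classes, which are never flattened
     * @param atomic           true if access must not tear (atomic class or volatile field)
     * @param payloadBits      size of the fields plus any null channel
     * @param atomicWindowBits widest atomic access the platform offers (e.g. 64 or 128)
     */
    static Flattening choose(boolean hasIdentity, boolean atomic,
                             int payloadBits, int atomicWindowBits) {
        if (hasIdentity) {
            return Flattening.NON_FLAT;
        }
        if (atomic) {
            // Atomic layouts try low-flat and fall back to a pointer.
            return payloadBits <= atomicWindowBits
                    ? Flattening.LOW_FLAT
                    : Flattening.NON_FLAT;
        }
        // Non-atomic layouts may be inlined up to the size threshold.
        return payloadBits <= FULL_FLAT_LIMIT_BITS
                ? Flattening.FULL_FLAT
                : Flattening.NON_FLAT;
    }
}
```

In this sketch the atomic flag never adds a new strategy; it only rules out full-flat and makes the result depend on the platform's atomic window.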


We originally split B2 out from B3 to support no-good-default values
(aka allow null), support atomicity and avoid tearing. Anything
missing in that list?


These are the moving parts.

B3s are
more akin to a hint than a promise.

B2s too.  A conformant VM could use pointers for everything, though might have 
to do extra null checks in things like checkcast.

Many of the properties we want for B2 classes are possible because we
adopted references (L carriers).  If we shift towards guaranteed
atomicity for (some) B3.vals, we're going to need to re-examine the VM
model and look at how we represent these additional constraints so the
VM can enforce them.

Yes.  I believe this is limited to the moral equivalent of an ACC_ATOMIC bit, 
plus some extra steps in the layout / field access instruction selection.

The VM can provide some tearing-related guarantees for Qs without
indirection but they are hardware dependent - 64bit for sure on all
64bit hardware, 128bit on some newer Intel hardware, possibly
different constraints on still other platforms - but maybe that's OK?
Declaring that a type must not tear makes it harder for the VM to provide
better density.

Yes.  In this model, an atomic B3 is basically a B2 without the need for a null 
channel, so a slightly thinner B2.  When the user asks for more atomicity, they 
are constraining flattening (but not necessarily throwing it all out the 
window; maybe low-flat will work for them).  All the hardware-dependent stuff 
lives in the layout algorithm; whether the current processor can support the 
desired atomics may be used to select between {low, no, full}-flat.

The biggest concern I have with this approach is that instead of
having 3 buckets, we're now exposing more of a buffet of options to
users.  Circling back to where I started this email - good defaults
are critical and so is good guidance on when to pick each of the
options or performance cargo cults will undercut the work to split out
the different cases.

Yes.  This is a subjective question, whether three equally spaced buckets feels 
more complex than two buckets with some knobs on the second bucket.  We surely 
want to avoid overwhelming the user with too big a menu, but we already have 
one user (hi Remi!) clamoring for the full buffet, and he’s hungry.



On Apr 27, 2022, at 2:51 PM, Dan Heidinga wrote:

I'm trying to understand how this refactoring fits the VM physics.

In particular, __non-atomic & __zero-ok fit together at the VM level
because the VM's natural state for non-atomic (flattened) data is zero
filled.  When those two items are decoupled, I'm unclear on what the
VM would offer in that case.  Thoughts?

How does "__non-atomic __non-id class B2a { }" fit with the "no new
nulls" requirements?

--Dan

On Wed, Apr 27, 2022 at 12:45 PM Brian Goetz wrote:

Here’s some considerations for stacking the user model.  (Again, please let’s 
resist the temptation to jump to the answer and then defend it.)

We have a stacking today which says:

- B1 is ordinary identity classes, giving rise to a single reference type
- B2 are identity-free classes, giving rise to a single reference type
- B3 are flattenable identity-free classes, giving rise to both a reference 
(L/ref) and primitive (Q/val) type.

This stacking has some pleasant aspects.  B2 differs from B1 by “only one bit”: 
identity.  The constraints on B2 are those that come from the lack of identity 
(mutability, extensibility, locking, etc.)  B2 references behave like the 
object references we are familiar with; nullability, final field guarantees, 
etc.  B3 further makes reference-ness optional; reference-free B3 values give 
up the affordances of references: they are zero-default and tearable.  This 
stacking is nice because it can be framed as a sequence of “give up some X, get 
some Y”.

People keep asking “do we need B2, or could we get away with B1/B3”.  The main 
reason for having this distinction is that some id-free classes have no 
sensible default, and so want to use null as their default.  This is a 
declaration-site property; B3 means that the zero value is reasonable, and use 
sites can opt into / out of  zero-default / nullity.  We’d love to compress 
away this bucket but forcing a zero on classes that can’t give it a reasonable 
interpretation is problematic.  But perhaps we can reduce the visibility of 
this in the model.

The degrees of freedom we could conceivably offer are

   { identity or not, zero-capable or not, atomic or not } x { use-site, 
declaration-site }

In actuality, not all of these boxes make sense (disavowing the identity of an 
ArrayList at the use site), and some have been disallowed by the stacking (some 
characteristics have been lumped.)  Here’s another way to stack the declaration:

- Some classes can disavow identity
- Identity-free classes can further opt into zero-default (currently, B3, 
polarity chosen at use site)
- Identity-free classes can further opt into tearability (currently, B3, 
polarity chosen at use site)

It might seem the sensible move here is to further split B3 into B3a and B3b 
(where all B3 support zero default, and a/b differ with regard to whether 
immediate values are tearable).  But that may not be the ideal stacking, 
because we want good flattening for B2 (and B3.ref) also.  Ideally, the 
difference between B2 and B3.val is nullity only (Kevin’s antennae just went 
up.)

So another possible restacking is to say that atomicity is something that has 
to be *opted out of* at the declaration site (and maybe also at the use site.)  
With deliberately-wrong syntax:

   __non-id class B2 { }

   __non-atomic __non-id class B2a { }

   __zero-ok __non-id  class B3 { }

   __non-atomic __zero-ok  __non-id class B3a { }

In this model, you can opt out of identity, and then you can further opt out of 
atomicity and/or null-default.  This “pulls up” the atomicity/tearability to a 
property of the class (I’d prefer safe by default, with opt out), and makes 
zero-*capability* an opt-in property of the class.  Then for those that have 
opted into zero-capability, at the use site, you can select .ref (null) / .val 
(zero).  Obviously these all need better spellings.  This model frames specific 
capabilities as modifiers on the main bucket, so it could be considered either 
a two bucket, or a four bucket model, depending on how you look.
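
One way to read the four declarations above is as three boolean traits with a dependency between them. The sketch below models that in ordinary Java; the record ClassTraits and its method names are hypothetical, and it describes only the user-model bookkeeping, not anything the VM does.

```java
// Hypothetical model of the declaration-site modifiers; the real spellings
// are undecided, and this is bookkeeping only.
record ClassTraits(boolean nonId, boolean zeroOk, boolean nonAtomic) {
    ClassTraits {
        // Zero-default and tearability are only offered to identity-free classes.
        if (!nonId && (zeroOk || nonAtomic)) {
            throw new IllegalArgumentException("__zero-ok and __non-atomic require __non-id");
        }
    }

    // A .val (zero-default) use site needs the class to have opted into zero-capability.
    boolean allowsVal() { return zeroOk; }

    // A .ref (null-default) use site is always available.
    boolean allowsRef() { return true; }

    // Bucket names as used in the declarations above.
    String bucket() {
        if (!nonId) {
            return "B1";
        }
        return (zeroOk ? "B3" : "B2") + (nonAtomic ? "a" : "");
    }
}
```

Seen this way, the "two bucket or four bucket" question is whether the user counts the value of nonId alone, or the combinations of the two modifiers underneath it.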

The author is in the best place to make the atomicity decision, since they know 
the integrity constraints.  Single field classes, or classes with only single 
field invariants (denominator != 0), do not need atomicity.  Classes with 
multi-field invariants do.
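
To see why multi-field invariants are the ones at risk, here is a small deterministic simulation of a torn read. The Range record and the TearDemo helper are hypothetical names for illustration; no actual data race is involved, the code just combines fields from two valid values the way a racing reader of a non-atomic flattened layout might.

```java
// Hypothetical value-like class with a two-field invariant: lo <= hi.
record Range(int lo, int hi) {
    Range {
        if (lo > hi) {
            throw new IllegalArgumentException("lo > hi");
        }
    }
}

final class TearDemo {
    // Simulates what a racing reader could observe if the two fields were
    // written with separate narrow stores: lo from the newer write, hi from
    // the older one. Raw fields are returned because the torn state never
    // passed through the Range constructor.
    static int[] tornRead(Range older, Range newer) {
        return new int[] { newer.lo(), older.hi() };
    }

    static boolean invariantHolds(int[] fields) {
        return fields[0] <= fields[1];
    }
}
```

Tearing between Range(0, 5) and Range(10, 20) can yield lo = 10, hi = 5, a state that no constructor call ever produced. A single-field invariant such as denominator != 0 survives the same mixing, because the one field involved always comes whole from some valid value.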

This differs from the previous stacking in that it moves the spotlight from 
_references_ and their properties, to the properties themselves.  It says to 
class writers: you should declare the ways in which you are willing to trade 
safety for performance; you can opt out of the requirement for references and 
nulls (saving some footprint) and atomicity (faster access).  It says to class 
*users*, you can pick the combination of characteristics, allowed by the 
author, that meet your needs (can always choose null default if you want, just 
use a ref.)

There are many choices here about “what are the defaults”.  More opting in at 
the declaration site might mean less need to opt in at the use site.  Or not.

(We are now in the stage which I call “shake the box”; we’ve named all the 
moving parts, and now we’re looking for the lowest-energy state we can get them 
into.)


