Let me give a brief overview of where things are with respect to flattening, 
since some of this influences the user-model discussion Kevin has initiated.  
(This is a very rough sketch, and not written for a general audience, so if 
you’re tempted to post this to Twitter because it seems cool and 
curiosity-satisfying, while I can’t stop you, you’re probably anti-helping.)

Layout is always at the discretion of the JVM; that’s how we like it. There 
will be no directives for “forcing” any kind of layout, including flattening.  
The JVM always has the option of indirecting with a pointer.  Currently it 
always does this for object references, and never does this for primitives.  
For Bucket 1 classes, we will almost certainly continue to lay out an LBucket1 
as a pointer.  (Remember that layout of an object with an LFoo field often 
happens before Foo is loaded; flattening introduces an ordering edge into the 
class loading graph.)  

Most people think of flattening as being only flattening of heap layouts, but 
there is also flattening in the calling convention, and this can be a huge 
source of benefit.  Flattening in the calling convention means that rather than 
passing an aggregate to or from an out-of-line call via a pointer, we scalarize 
the value and pass the field values instead.   Calling convention is generally 
determined early in the run, so if we load the class after the calling 
convention is set, we may miss out on this.  
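
To make "scalarize the value and pass the field values instead" concrete, here is a minimal Java sketch of the two conventions. It is only an illustration: the JIT does this transformation internally, nobody writes the scalarized form by hand, and the names (CallingConventionSketch, Point, and both methods) are hypothetical. A record stands in for an identity-free class.

```java
// Illustration only: models what a flattened calling convention amounts to.
// All names here are hypothetical; the real transformation happens inside the JIT.
final class CallingConventionSketch {

    // A small aggregate; a record stands in for an identity-free class.
    record Point(int x, int y) {}

    // Indirected convention: the callee receives one pointer to the aggregate.
    static int lengthSquaredByPointer(Point p) {
        return p.x() * p.x() + p.y() * p.y();
    }

    // Flattened convention: the callee receives the field values directly,
    // so the caller never has to materialize a Point in the heap.
    static int lengthSquaredScalarized(int x, int y) {
        return x * x + y * y;
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4);
        System.out.println(lengthSquaredByPointer(p));      // prints 25
        System.out.println(lengthSquaredScalarized(3, 4));  // prints 25
    }
}
```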

For a reference type (e.g., B2 classes, and B3.ref), we are constrained by two 
properties of reference-ness: the need to represent null, and the JMM 
constraint that loads and stores of references are atomic with respect to one 
another.  (This is where tear-freedom comes from.)  Nullity can be represented 
as some sort of footprint tax (inject a boolean, or reinterpret slack bits such 
as low order pointer bits in existing fields).  Tearing is not relevant to 
stack (calling convention) flattening, so even L types can get flattening on 
the stack.  

I’ll pause because this is sort of amazing: an LB2, while a reference type, is, 
in the current implementation, routinely flattened in calling convention, using 
an extra synthetic field for null.  If you thought references were always 
indirections, you’ll be surprised.  Long chains of things like 
Optional.map(…).flatMap(…) are routinely allocation-free in C2-compiled code, 
even for out-of-line calls.  (The interpreter and C1 still use indirections on 
the stack and in locals.)
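
A rough way to picture the "extra synthetic field for null" is below. This is a sketch under stated assumptions, not how the VM spells it: Box is a hypothetical one-field class standing in for a B2 type, and the boolean parameter plays the role of the synthetic null channel.

```java
// Illustration only: a nullable aggregate passed as a pointer versus passed
// as scalars plus a null-channel flag. All names are hypothetical.
final class NullChannelSketch {

    // Hypothetical nullable, identity-free aggregate with one int field.
    record Box(int value) {}

    // Reference form: null travels as a null pointer.
    static int valueOrDefault(Box b, int fallback) {
        return b == null ? fallback : b.value();
    }

    // Scalarized form: the boolean acts as the synthetic null-channel field;
    // 'value' is only meaningful when nonNull is true.
    static int valueOrDefaultScalarized(boolean nonNull, int value, int fallback) {
        return nonNull ? value : fallback;
    }

    public static void main(String[] args) {
        System.out.println(valueOrDefault(new Box(42), -1));         // prints 42
        System.out.println(valueOrDefault(null, -1));                // prints -1
        System.out.println(valueOrDefaultScalarized(true, 42, -1));  // prints 42
        System.out.println(valueOrDefaultScalarized(false, 0, -1));  // prints -1
    }
}
```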

In the heap, this is where reference types (including B2) have some trouble.  
The atomicity requirement bites hard here.  References in the heap are 
routinely laid out as indirections.  Final references to id-free instances 
_could_ safely be flattened, but they are not yet.  Mutable references to 
id-free instances are problematic because of potential tearing.  We *could* 
(but do not yet, and it’s complicated) flatten 64 bit values by stuffing 
multiple 32 bit values into a single synthetic field, or by storing/loading 
multiple fields with a single access (“fat loads”), and could go up to 128 
bits on platforms with fast 128 bit atomics (which include some Intel cores 
where the spec was recently revised to commit to atomicity).  But the 
complexity cost here is high, and flattening would be limited by the 
instruction set.  This is under investigation but unlikely to be a magic 
bullet.  
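
The "stuff multiple 32 bit values into a single synthetic field" idea can be modeled at the library level as follows. This is only an analogy, assuming two int fields and using an AtomicLong as the 64 bit cell; the VM would do the equivalent in the object layout itself, and PackedPairSketch and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustration only: two 32-bit fields stored in one 64-bit cell, so that both
// are always read and written together with a single atomic access.
final class PackedPairSketch {

    // Stands in for the single synthetic 64-bit field.
    private final AtomicLong cell = new AtomicLong();

    // Combine two ints into one long; the mask stops sign extension of 'lo'
    // from overwriting the high half.
    static long pack(int hi, int lo) {
        return ((long) hi << 32) | (lo & 0xFFFF_FFFFL);
    }

    static int hi(long packed) {
        return (int) (packed >>> 32);
    }

    static int lo(long packed) {
        return (int) packed;
    }

    // One atomic store publishes both fields; readers cannot see a mix.
    void set(int hi, int lo) {
        cell.set(pack(hi, lo));
    }

    // One atomic load returns a consistent snapshot of both fields.
    long get() {
        return cell.get();
    }

    public static void main(String[] args) {
        PackedPairSketch pair = new PackedPairSketch();
        pair.set(-1, 7);
        long snapshot = pair.get();
        System.out.println(hi(snapshot) + ", " + lo(snapshot)); // prints -1, 7
    }
}
```

The limit is visible right in the sketch: once the fields add up to more than the widest atomic access the platform offers, this trick stops working.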

In the heap, Q types (B3.val) can be fully flattened (though the VM will likely 
impose a threshold above which it uses indirections anyway, such as 512 bits).  
Full flattening means not only the layout, but that we can access the fields of 
the nested object with narrow single-field loads and stores.  
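
As a hand-written model of what full flattening with narrow access looks like, consider an array of two-field values stored inline in an int[]. Again this is a sketch, not the VM's layout: FlatArraySketch and its methods are hypothetical, and the two fields are assumed to be ints.

```java
// Illustration only: an "array of (x, y) values" stored inline, with no
// per-element object and no pointer to chase. All names are hypothetical.
final class FlatArraySketch {

    // Layout: x0, y0, x1, y1, ...
    private final int[] storage;

    FlatArraySketch(int length) {
        storage = new int[2 * length];
    }

    // Narrow single-field loads: each touches only the field it needs.
    int x(int i) {
        return storage[2 * i];
    }

    int y(int i) {
        return storage[2 * i + 1];
    }

    // Two separate narrow stores. Without synchronization, a racing reader
    // could observe the new x with the old y; that is what tearing means.
    void set(int i, int x, int y) {
        storage[2 * i] = x;
        storage[2 * i + 1] = y;
    }

    public static void main(String[] args) {
        FlatArraySketch points = new FlatArraySketch(2);
        points.set(1, 10, 20);
        System.out.println(points.x(1) + ", " + points.y(1)); // prints 10, 20
    }
}
```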

Scorecard:
 - Identity-free reference (L) types can be flattened, within limits (which is 
amazing)
 - Identity-free reference types usually pay some footprint tax for the null 
channel
 - Identity-free reference types are routinely flattened on stack, and may get 
some more heap flattening in the future
 - Identity-free immediate types have no null channel, and can be fully 
flattened and accessed with narrow loads and stores, because they’re allowed to 
tear

To the extent we treat B2 and B3.ref the same way (which we want to), any 
flattening wins for refs (e.g., final fields, fat access) will apply to both.  

