On Fri, Jul 22, 2022 at 12:02 PM John Rose <[email protected]>
wrote:
On 22 Jul 2022, at 10:55, Brian Goetz wrote:
…
So then, would we call an instance of `Complex.val` a
"non-heap object" or an "inlined object" or what? We need
to flesh out a whole lexicon. The phrase "value object"
becomes useless for this particular distinction as it
will apply to both.
Yes, in the taxonomy I’m pushing, a “value object” is one
without identity, and is the kind of object you can store
directly in variables without going through a reference. But
I don’t think that there are instances of Complex.val and
instances of Complex.ref; I think there are instances of
*Complex*, and multiple ways to describe/store/access them.
FTR, I enthusiastically agree with this viewpoint, even though I
am also probing for weaknesses and alternatives. (FTR I feel the
same about Brian’s summary in his previous short message.)
And under this viewpoint, the terms “instance” and “object” have
the same denotation, though different connotations. (When I say
“instance” you may well think, “instance of what”? But you don’t
ask that question so much if I say “object”.)
That `int/Integer` decision you've been making has
always been between (1) value and (2) (reference-to)
object, and that decision is still exactly between
(1) value and (2) (reference-to) object now, and btw
the definitions of 'reference' and 'object' remain
precisely wedded to each other as always.
The "heap object" alternative strikes me (and I am trying
to be fair, here) as:
Now, that's an object either way, and you're going to
apply that old thought process toward which *kind* of
object you mean, either a (1) "inline object" or a
(2) "(reference-to) heap object". It's now just heap
objects and references that are paired together.
I think, Kevin, you are going wrong at this point: It’s not a
/kind/ of object, it is a /placement/ of an object. What “kind”
of person am I when I am driving to the office? Surely the same
“kind” as when I am at home. But when I am driving, I am equipped
with a car and a road, much like a heap-placed object is equipped
with a header and references.
Likewise, an int/Integer is (in Valhalla) the same “kind” of
object (if we go all the way to making primitives be honorary
objects) whether it is placed in heap or on stack or inside
another object.
The distinction that comes from the choice of equipping an int
with a header in heap storage is a distinction of placement (and
corresponding representation). So an int/Integer does not
intrinsically have a header because it is an object (because of
its “kind”). It /may/ have a header if the JVM needs to give it
one, because it is stuck in the heap.
(My points about int/Integer could partly fail if we fail to
align int and Integer in the end. So transfer the argument to
C.val/C.ref if you prefer. It is the same argument.)
And I would say the /placement/ of an object falls into three broad
cases which are worth teaching even to beginners:
* “in the heap”: therefore referred to by a machine word
  address, and presumably equipped with a header and maybe
  surrounded by some alignment waste; a JVM might have multiple
  heaps but at this level of discourse we say “the heap”
* “on the stack”: therefore manipulated directly by its
  components, which are effectively separated into scalars (it
  is “scalarized”, we sometimes say); we might sometimes wish
  to say “JVM stack or locals” instead of “stack”, or, with
  increasing detail, “on stack, in locals, and/or in registers,
  and/or as immediates in the machine code”
* “contained in another object”: in a field or array element,
  therefore piggy-backing on the other object’s placement; and
  note that even arrays are scalarized sometimes, lifting their
  elements into registers etc.
To summarize: |Placement = Heap | Stack | Contained[Placement]|.
One might use the term “inline” somewhere in there, either to
mean |Contained| or |Stack|Contained[*]|.
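For readers who like to see the grammar as code, here is a minimal sketch of that summary as a recursive sum type. Every name in it (`PlacementSketch`, `Placement`, `Heap`, `Stack`, `Contained`, `root`, `isInline`) is hypothetical and exists nowhere in the JDK or in Valhalla; it models only the vocabulary, not anything the JVM actually exposes. It assumes Java 17 or later for records and sealed interfaces.

```java
// Hypothetical model of the placement vocabulary; not a JVM or Valhalla API.
public class PlacementSketch {
    // Placement = Heap | Stack | Contained[Placement]
    sealed interface Placement permits Heap, Stack, Contained {}
    record Heap() implements Placement {}
    record Stack() implements Placement {}
    // A contained instance piggy-backs on the placement of its container.
    record Contained(Placement container) implements Placement {}

    // Follow the chain of containers to the outermost placement.
    static Placement root(Placement p) {
        while (p instanceof Contained c) {
            p = c.container();
        }
        return p;
    }

    // The wider reading of "inline": Stack | Contained[*].
    static boolean isInline(Placement p) {
        return p instanceof Stack || p instanceof Contained;
    }

    public static void main(String[] args) {
        // An array element inside an object that itself lives in the heap.
        Placement element = new Contained(new Contained(new Heap()));
        System.out.println(root(element));
        System.out.println(isInline(element));
    }
}
```

Note that under the wider reading an instance can be "inline" while its outermost container sits in the heap, which is one reason the word needs care.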
Static field values are a special case, but they can be
classified in one of the above ways. HotSpot places static fields
inside a special per-class object (the mirror, in fact), so their
values are either contained or separate in the heap (JVM’s choice
again).
One might be pedantic and say that an instance can be contained
“in static memory” (neither heap nor stack) if the JVM implements
storage for static fields outside of the heap. But in that case
I’d rather say that they are in a funny corner of the heap, where
perhaps headers are not needed, because some static metadata
somewhere dictates what is stored.
(Hence I like to be cagey about whether a heap-object actually
has a physical header. It might not in some JVM implementations.)
Starting to prefer the first way (as I did) did not feel
like going rogue: after all, did we not gravitate toward
".ref" and ".val" as our placeholder syntaxes, not
".inline" and ".heap" or anything else?
With you on this. I think asking users to reason about “heap
objects” vs “inline objects” is pushing them towards the
implementation, not the concepts. They may have to reason
about this to understand the performance model, but that’s
already advanced material.
Yes. And even more specifically in the implementation, users who
think about “heap objects” are really (IMO) trying to predict the
/placement/ of the objects, /where/ the JVM will choose to place
their bits in physical memory.
This question of placement is very interesting to the “alert”
performance-minded programmer. Not every programmer is in that
state; for my part, I try to practice “first make it work, then make
it fast”. I get “alert” to performance only in the “make it fast”
phase, a phase which many of my codes never reach.
As a sort of “siren song” the question of placement is /also/
interesting to the beginning student who is struggling to build a
mental image of Java data, and is reaching for visualizations in
terms of memory and addresses, or (what is about the same) boxes
and arrows. But the JVM will make a hash of all that, if it is
doing a good job. So the student must be told to hold those
mental models lightly.
Kevin is insisting (for his own good reasons) on his answer to
“where are the objects”: They are always “in the heap” and thus
“with headers, accessed by pointers”. I suspect (but haven’t seen
from Kevin himself yet) that this is in part due to a desire to
work with, rather than work against, the student’s desire to make
simple visual models of Java data.
Crucially, in a literal “boxes and arrows” model, an arrow
(perhaps a |C.ref| reference to an instance) looks very different
from a nested box (perhaps a |C.val| instance), and the naive
user might insist that such differences are part of the contract
between the user and the JVM. But they are not. The JVM might
introduce invisible “arrows” (because of heap buffering) and it
might remove arrows (because of scalarization for a number of
possible reasons).
So if the student is told that the arrows and boxes are “what’s
really going on”, the student who uses that assurance to predict
performance and footprint will feel cheated in the end.
To summarize: Any given instance/object has logically independent
properties of class and placement.
And thus: The choice of companion type does not affect class but
may (may!) affect placement.
Circling back to the language design, it might seem odd that
there are three ways to place an object but just two companion
types. But this oddness goes away if you realize that |C.val| and
|C.ref| are not placement directives. The choice between the two
is a net-binary selection from a sizeable menu of “affordances”
that the user might be expecting or disavowing at any given point
in the code. (See my lists of “affordances” and “alternative
affordances” in encapsulating-val
<http://cr.openjdk.java.net/~jrose/values/encapsulating-val.html#affordances-of-c.ref>.)
The user is given this simplified switch to influence the JVM’s
decisions about placement (and therefore representation). It is
useful because the JVM can employ different implementation
tactics depending on the differences between the user-visible
contracts of |C.ref| and of |C.val|. In the choice of
implementation tactics, the JVM has the final say.
--
Kevin Bourrillion | Java Librarian | Google, Inc. |[email protected]