I was working on some docs and am not sure if we came to a conclusion on the rules about who may, may not, or must declare ValueObject or IdentityObject.

Let me see if I can chart the boundaries of the design space. I'll start with IdentityObject since it is more constrained.

 - Clearly for legacy classes, the VM is going to have to infer and inject IdentityObject.
 - Since IdentityObject is an interface, it is inherited; if my super implements IO, so am I.
 - It seems desirable that a user be *allowed* to name IdentityObject as a superinterface of an interface or abstract class, which constrains what subclasses can do.  (Alternately we could spell this "value interface" or "value abstract class"; this is a separate set of tradeoffs.)
 - There is value in having exactly one way to say certain things; it reduces the space of what has to be specified and tested.
 - I believe our goal is to know everything we need to know at class load time, and not to have to go back and do complex checks on a supertype when a subclass is loaded.
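The inheritance point can be demonstrated with ordinary interfaces today.  A minimal runnable sketch, using a stand-in marker interface (java.lang.IdentityObject is a Valhalla draft, not a real JDK type, so we declare our own for illustration):

```java
// Stand-in for the proposed marker interface.
interface IdentityObject {}

// An abstract class that opts in: every subclass is an identity class.
abstract class IdentityHolder implements IdentityObject {}

// Sub inherits IdentityObject through its super; no inference is needed
// when Sub is loaded -- the superinterface is already known.
class Sub extends IdentityHolder {}

public class InheritDemo {
    public static void main(String[] args) {
        Object o = new Sub();
        System.out.println(o instanceof IdentityObject);  // prints: true
    }
}
```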

The choice space seems to be
  user { must, may, may not } specify IO on concrete classes
  x compiler { must, may, may not } specify IO when ACC_VALUE present
  x VM (and reflection) { mops up }

where "mopping up" minimally includes dealing with legacy classfiles.

Asking the user to say "IdentityObject" on each identity class seems ridiculous, so we can drop that one.

  user { may, may not } specify IO on concrete classes
  x compiler { must, may, may not } specify IO when ACC_VALUE present
  x VM (and reflection) { mops up }

From a user model perspective, it seems arbitrary to say the user may not explicitly say IO for concrete classes, but may do so for abstract classes.  So the two consistent user choices are either:

 - User can say "implements IO" anywhere they like
 - User cannot say "implements IO" anywhere, and instead we have an "identity" modifier which is optional on concrete classes and acts as a constraint on abstract classes/interfaces.

While having an "identity" modifier is nice from a completeness perspective, the fact that it is probably erased to "implements IdentityObject" creates complication for reflection (and another asymmetry between reflection and javax.lang.model).  So it seems that just letting users say "implements IdentityObject" is reasonable.

Given that the user has a choice, there is little value in "compiler may not inject", so the choice for the compiler here is "must" vs "may" inject.  This is really asking whether we want to draw the VM line at legacy vs. new classfiles, or merely to add IO as a default when nothing else has been selected.  Note that asking the compiler to inject based on ACC_VALUE also asks pretty much everything that touches bytecode to do the same, and is likely to generate more errors from bytecode manglers.  The VM is doing inference either way; what we get to choose here is the axis.

Let's put a pin in IO and come back to VO.

The user is already saying "value", and we're stuck with the default being "identity".  Unless we want to have the user say "value interface" for a value-only interface (which moves some complexity into reflection, but is also a consistent model), I think we're stuck with letting the user specify either IO/VO on an abstract class / interface, which sort of drags us towards letting the user say it (redundantly) on concrete classes too.
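Under the "implements IO/VO anywhere" model, the declaration space would look roughly like this.  This is illustrative draft syntax only: value classes and the marker interfaces are Valhalla proposals, so none of it compiles on a current JDK.

```java
interface Shape extends IdentityObject {}       // identity-only interface:
                                                // constrains implementors
interface Scalar extends ValueObject {}         // value-only interface

value class Complex implements ValueObject {}   // redundant but allowed
class Session implements IdentityObject {}      // redundant but allowed

// The consistency check rejects contradictions:
// value class Bad implements IdentityObject {} // error: a value class
//                                              // cannot be an identity class
```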

The compiler and VM will always type-check the consistency of the value keyword/bit and the implements clause.  So the real question is where the inference/injection happens.  And the VM will have to do injection at least for IO, at least for legacy classes.

So the choices for VM infer&inject seem to be:

 - Only inject IO for legacy concrete classes, based on classfile version; otherwise require everything to be explicit;
 - Inject IO for concrete classes when ACC_VALUE is not present; require VO to be explicit;
 - Inject IO for concrete classes when ACC_VALUE is not present; inject VO for concrete classes when ACC_VALUE is present.
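The third rule can be sketched as a small self-contained function (names are illustrative, not HotSpot internals): given the ACC_VALUE bit and the set of markers already present, directly or via a supertype, inject exactly one marker when neither is present.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class InjectRule {
    static final String IO = "IdentityObject";
    static final String VO = "ValueObject";

    // Hypothetical model of the VM's load-time mop-up under rule 3:
    // the only inputs are ACC_VALUE and the marker interfaces already
    // reachable from the class; no supertype re-checking is needed later.
    static Set<String> effectiveInterfaces(boolean accValue,
                                           Set<String> present) {
        Set<String> out = new LinkedHashSet<>(present);
        if (!out.contains(IO) && !out.contains(VO)) {
            out.add(accValue ? VO : IO);
        }
        return out;
    }

    public static void main(String[] args) {
        // Legacy or plain concrete class: IdentityObject is injected.
        System.out.println(effectiveInterfaces(false, Set.of()));
        // ACC_VALUE class with nothing declared: ValueObject is injected.
        System.out.println(effectiveInterfaces(true, Set.of()));
        // Already explicit (e.g. via an abstract super): nothing to do.
        System.out.println(effectiveInterfaces(true, Set.of(VO)));
    }
}
```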

Is infer&inject measurably more costly than just ordinary classfile checking?  It seems to me that if all things are equal, the simpler injection rule is preferable (the third), mostly on the basis of what it asks of humans who write code to manipulate bytecode, but if there's a real cost to the injection, then having the compiler help out is reasonable. (But in that case, it probably makes sense for the compiler to help out in all cases, not just VO.)



On 12/2/2021 6:11 PM, Dan Smith wrote:
> On Dec 2, 2021, at 1:04 PM, Dan Heidinga <heidi...@redhat.com> wrote:
>
>> On Thu, Dec 2, 2021 at 10:05 AM Dan Smith <daniel.sm...@oracle.com> wrote:
>>
>>> On Dec 2, 2021, at 7:08 AM, Dan Heidinga <heidi...@redhat.com> wrote:
>>>
>>>> When converting back from our internal form to a classfile for the
>>>> JVMTI RetransformClasses agents, I need to either filter the interface
>>>> out if we injected it or not if it was already there.  JVMTI's
>>>> GetImplementedInterfaces call has a similar issue with being
>>>> consistent - and that's really the same issue as reflection.
>>>>
>>>> There are a lot of small places that can easily become inconsistent -
>>>> and therefore a lot of places that need to be checked - to hide
>>>> injected interfaces.  The easiest solution to that is to avoid
>>>> injecting interfaces in cases where javac can do it for us, so the VM
>>>> has a consistent view.
>>>
>>> I think you may be envisioning extra complexity that isn't needed here.
>>> The plan of record is that we *won't* hide injected interfaces.
>>
>> +1.  I'm 100% on board with this approach.  It cleans up a lot of the
>> potential corner cases.
>>
>>> Our hope is that the implicit/explicit distinction is meaningless—that
>>> turning implicit into explicit via JVMTI would be a 100% equivalent
>>> change. I don't know JVMTI well, so I'm not sure if there's some reason
>>> to think that wouldn't be acceptable...
>>
>> JVMTI's "GetImplementedInterfaces" spec will need some adaptation, as
>> it currently states "Return the direct super-interfaces of this class.
>> For a class, this function returns the interfaces declared in its
>> implements clause."
>>
>> The ClassFileLoadHook (CFLH) runs either with the original bytecodes
>> as passed to the VM (the first time) or with "morally equivalent"
>> bytecodes recreated by the VM from its internal classfile formats.
>> The first time through the process the agent may see a value class
>> that doesn't have the VO interface directly listed, while after a call
>> to {retransform,redefine}Classes, the VO interface may be directly
>> listed.  The same issues apply to the IO interface with legacy
>> classfiles, so with some minor spec updates, we can paper over that.
>>
>> Those are the only two places (GetImplementedInterfaces, and CFLH with
>> the related redefine/retransform functions) I can find in the JVMTI spec
>> that would be affected.  Some minor spec updates should be able to
>> address both, to ensure an inconsistency in the observed behaviour is
>> treated as valid.
>
> Useful details, thanks.
>
> Would it be a problem if the ClassFileLoadHook gives different answers
> depending on the timing of the request (derived from original bytecodes
> vs. JVM-internal data)? If we need consistent answers, it may be that the
> "original bytecode" approach needs to reproduce the JVM's inference logic.
> If it's okay for the answers to change, there's less work to do.
>
> To highlight your last point: we *will* need to work this out for inferred
> IdentityObject, whether we decide to infer ValueObject or not.
