I thought we were wrapping this up; I'm not sure how we got back to "do we even need these at all", but OK.  Splitting off a separate (hopefully short) thread.

These interfaces serve both a dynamic and static role. Statically, they allow us to constrain inputs, such as:

    void runWithLock(IdentityObject lock, Runnable task)

and similar use in generic type bounds.
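
To make the generic-bound use concrete, here is a sketch. Since IdentityObject is not in any released JDK, the example declares a local stand-in marker, and all class names (Session, IdentityCache) are illustrative, not from any real API:

```java
// Stand-in for a Valhalla-style java.lang.IdentityObject marker interface,
// declared locally so this sketch compiles on current JDKs.
interface IdentityObject { }

// Hypothetical identity class for the demo.
class Session implements IdentityObject { }

// A cache keyed by identity: the generic bound statically rejects
// value-class keys, for which identity-based lookup would be meaningless.
class IdentityCache<K extends IdentityObject, V> {
    private final java.util.IdentityHashMap<K, V> map = new java.util.IdentityHashMap<>();
    void put(K key, V value) { map.put(key, value); }
    V get(K key) { return map.get(key); }
}
```

The point is that the constraint is checked at compile time: with the real interfaces, `IdentityCache<SomeValueClass, V>` simply would not compile.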

Dynamically, they allow code to check before doing something partial:

    if (x instanceof IdentityObject) { synchronized(x) { ... } }

rather than trying and dealing with IllegalMonitorStateException (IMSE).
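
A sketch of that dynamic check, again with a local stand-in marker (the helper name and fallback behavior are illustrative, not proposed API):

```java
// Stand-in marker, since Valhalla's IdentityObject is not in released JDKs.
interface IdentityObject { }

class LockHelper {
    // Runs the task under the object's monitor only when the object is known
    // to have identity; otherwise runs it unsynchronized rather than letting
    // synchronized(x) fail with IllegalMonitorStateException.
    static void runWithLockIfPossible(Object x, Runnable task) {
        if (x instanceof IdentityObject) {
            synchronized (x) { task.run(); }
        } else {
            task.run();  // value object: no monitor to take
        }
    }
}
```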

Introducing new interfaces that have no methods is clearly source- and binary compatible, so I am not particularly compelled by "some very brittle and badly written code might break."  So far, no one has proposed any examples that would make us reconsider that.

As to "value class" vs "primitive class" vs "built in primitive", I see no reason to add *additional* mechanisms by which to distinguish these in either the static or dynamic type systems; the salient difference is identity vs value. (Reflection will almost certainly give us means to ask questions about how the class was declared, though.)

As to B3: instanceof operates on reference types, so (at least from a pure spec / model perspective), `x instanceof T` gets answered on value instances by lifting to the reference type, and answering the question there.  So it would not even be a sensible question to ask "are you a primitive value vs primitive reference"; subtyping is a "reference affordance", and questions about subtyping are answered in the reference domain.
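
A rough analogy using today's int/Integer pair (since B3/B4 primitives are not available in released JDKs, boxing stands in for the "lifting" described above):

```java
class LiftDemo {
    public static void main(String[] args) {
        int i = 42;        // a primitive value; no subtyping in this domain
        Object o = i;      // lifted (boxed) to the reference type Integer
        // The question is answered in the reference domain; subtyping is
        // a "reference affordance".
        System.out.println(o instanceof Comparable);  // prints "true"
    }
}
```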

And to B4: the goal is to make B3 and B4 as similar as possible; there are going to be obvious ways in which we can't do this, but this should not be relevant to either the static or dynamic type system.



On 12/20/2021 2:05 PM, Remi Forax wrote:
Brian,
the last time we talked about IdentityObject and ValueObject, you said that you were aware that introducing those interfaces will break some existing code,
but you wanted to know whether it was a lot of code or not.

So I do not understand now why you want to mix IdentityObject/ValueObject with the runtime behavior; it seems risky, and if we need to back out the introduction of those interfaces, it will be more work than it should be. Decoupling the typing part from the runtime behavior seems a better solution.

Moreover, the split between IdentityObject and ValueObject makes less sense now that we have 3 kinds of value objects: the identityless reference (B2), the primitive (B3), and the builtin primitive (B4). Why do we want these types to be visible in the type system, but not, for example, the set containing only B3 and B4?

Rémi

------------------------------------------------------------------------

    *From: *"Brian Goetz" <brian.go...@oracle.com>
    *To: *"daniel smith" <daniel.sm...@oracle.com>, "Dan Heidinga"
    <heidi...@redhat.com>
    *Cc: *"John Rose" <john.r.r...@oracle.com>,
    "valhalla-spec-experts" <valhalla-spec-experts@openjdk.java.net>
    *Sent: *Monday 20 December 2021 18:54:01
    *Subject: *Re: JEP update: Value Objects

    I was working on some docs and am not sure if we came to a
    conclusion on the rules about who may, may not, or must declare
    ValueObject or IdentityObject.

    Let me see if I can chart the boundaries of the design space. 
    I'll start with IdentityObject since it is more constrained.

     - Clearly for legacy classes, the VM is going to have to infer
    and inject IdentityObject.
     - Since IdentityObject is an interface, it is inherited; if my
    super implements IO, so am I.
     - It seems desirable that a user be *allowed* to name
    IdentityObject as a superinterface of an interface or abstract
    class, which constrains what subclasses can do.  (Alternately we
    could spell this "value interface" or "value abstract class"; this
    is a separate set of tradeoffs.)
     - There is value in having exactly one way to say certain things;
    it reduces the space of what has to be specified and tested.
     - I believe our goal is to know everything we need to know at
    class load time, and not to have to go back and do complex checks
    on a supertype when a subclass is loaded.
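
The inheritance point above can be sketched with a local stand-in marker (the real IdentityObject would additionally make a `value class` subclass of Shape a compile-time error; Shape and Circle are made-up names):

```java
// Stand-in marker; Valhalla's IdentityObject is not in released JDKs.
interface IdentityObject { }

// An author opts a whole hierarchy into identity by naming the marker on
// an abstract class, constraining what subclasses can do.
abstract class Shape implements IdentityObject { }

// Inherits IdentityObject: "if my super implements IO, so am I".
class Circle extends Shape { }
```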

    The choice space seems to be
      user { must, may, may not } specify IO on concrete classes
      x compiler { must, may, may not } specify IO when ACC_VALUE present
      x VM (and reflection) { mops up }

    where "mopping up" minimally includes dealing with legacy classfiles.

    Asking the user to say "IdentityObject" on each identity class
    seems ridiculous, so we can drop that one.

      user { may, may not } specify IO on concrete classes
      x compiler { must, may, may not } specify IO when ACC_VALUE present
      x VM (and reflection) { mops up }

    From a user model perspective, it seems arbitrary to say the user
    may not explicitly say IO for concrete classes, but may do so for
    abstract classes.  So the two consistent user choices are either:

     - User can say "implements IO" anywhere they like
     - User cannot say "implements IO" anywhere, and instead we have
    an "identity" modifier which is optional on concrete classes and
    acts as a constraint on abstract classes/interfaces.

    While having an "identity" modifier is nice from a completeness
    perspective, the fact that it is probably erased to "implements
    IdentityObject" creates complication for reflection (and another
    asymmetry between reflection and javax.lang.model).  So it seems
    that just letting users say "implements IdentityObject" is
    reasonable.

    Given that the user has a choice, there is little value in
    "compiler may not inject", so the choice for the compiler here is
    "must" vs "may" inject.  Which is really asking whether we want to
    draw the VM line at legacy vs new classfiles, or merely to add IO
    as a default when nothing else has been selected.  Note that
    asking the compiler to inject based on ACC_VALUE is also asking
    pretty much everything that touches bytecode to do this too, and
    likely to generate more errors from bytecode manglers.  The VM is
    doing inference either way, what we get to choose here is the axis.

    Let's put a pin in IO and come back to VO.

    The user is already saying "value", and we're stuck with the
    default being "identity".  Unless we want to have the user say
    "value interface" for a value-only interface (which moves some
    complexity into reflection, but is also a consistent model), I
    think we're stuck with letting the user specify either IO/VO on an
    abstract class / interface, which sort of drags us towards letting
    the user say it (redundantly) on concrete classes too.

    The compiler and VM will always type-check the consistency of the
    value keyword/bit and the implements clause.  So the real question
    is where the inference/injection happens.  And the VM will have to
    do injection at least for IO, and at least for legacy classes.

    So the choices for VM infer&inject seem to be:

     - Only inject IO for legacy concrete classes, based on classfile
    version, otherwise require everything to be explicit;
     - Inject IO for concrete classes when ACC_VALUE is not present,
    require VO to be explicit;
     - Inject IO for concrete classes when ACC_VALUE is not present;
    inject VO for concrete classes when ACC_VALUE is present
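
The third rule is simple enough to state as a one-liner. The sketch below is purely illustrative: the flag value is a placeholder (ACC_VALUE has no final value in the class-file specification), and the method is not any real JVM interface.

```java
class InjectionSketch {
    static final int ACC_VALUE = 0x0040;  // placeholder bit, not from the spec

    // The marker the VM would inject for a concrete class under the third
    // rule: keyed purely off ACC_VALUE, with no classfile-version check.
    static String injectedMarker(int accessFlags) {
        return (accessFlags & ACC_VALUE) != 0 ? "ValueObject" : "IdentityObject";
    }
}
```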

    Is infer&inject measurably more costly than just ordinary
    classfile checking?  It seems to me that if all things are equal,
    the simpler injection rule is preferable (the third), mostly on
    the basis of what it asks of humans who write code to manipulate
    bytecode, but if there's a real cost to the injection, then having
    the compiler help out is reasonable.  (But in that case, it
    probably makes sense for the compiler to help out in all cases,
    not just VO.)



    On 12/2/2021 6:11 PM, Dan Smith wrote:

            On Dec 2, 2021, at 1:04 PM, Dan Heidinga <heidi...@redhat.com> wrote:

                On Thu, Dec 2, 2021 at 10:05 AM Dan Smith <daniel.sm...@oracle.com> wrote:

                On Dec 2, 2021, at 7:08 AM, Dan Heidinga <heidi...@redhat.com> wrote:

                When converting back from our internal form to a classfile for the
                JVMTI RetransformClasses agents, I need to either filter the
                interface out if we injected it or not if it was already there.
                JVMTI's GetImplementedInterfaces call has a similar issue with
                being consistent - and that's really the same issue as reflection.

                There's a lot of small places that can easily become inconsistent -
                and therefore a lot of places that need to be checked - to hide
                injected interfaces.  The easiest solution to that is to avoid
                injecting interfaces in cases where javac can do it for us so the
                VM has a consistent view.


                I think you may be envisioning extra complexity that isn't needed
                here. The plan of record is that we *won't* hide injected
                interfaces.

            +1.  I'm 100% on board with this approach.  It cleans up a lot of the
            potential corner cases.

                Our hope is that the implicit/explicit distinction is
                meaningless—that turning implicit into explicit via JVMTI would
                be a 100% equivalent change. I don't know JVMTI well, so I'm not
                sure if there's some reason to think that wouldn't be
                acceptable...

            JVMTI's "GetImplementedInterfaces" spec will need some adaptation as
            it currently states "Return the direct super-interfaces of this
            class. For a class, this function returns the interfaces declared in
            its implements clause."

            The ClassFileLoadHook (CFLH) runs either with the original bytecodes
            as passed to the VM (the first time) or with "morally equivalent"
            bytecodes recreated by the VM from its internal classfile formats.
            The first time through the process the agent may see a value class
            that doesn't have the VO interface directly listed, while after a
            call to {retransform,redefine}Classes, the VO interface may be
            directly listed.  The same issues apply to the IO interface with
            legacy classfiles, so with some minor spec updates, we can paper
            over that.

            Those are the only two places I can find in the JVMTI spec that
            would be affected: GetImplementedInterfaces and CFLH (plus the
            related redefine/retransform functions).  Some minor spec updates
            should be able to address both, ensuring an inconsistency in the
            observed behaviour is treated as valid.

        Useful details, thanks.

        Would it be a problem if the ClassFileLoadHook gives different answers
        depending on the timing of the request (derived from original bytecodes
        vs. JVM-internal data)? If we need consistent answers, it may be that
        the "original bytecode" approach needs to reproduce the JVM's inference
        logic. If it's okay for the answers to change, there's less work to do.

        To highlight your last point: we *will* need to work this out for
        inferred IdentityObject, whether we decide to infer ValueObject or not.


