> From: "John Rose" <john.r.r...@oracle.com>
> To: "Remi Forax" <fo...@univ-mlv.fr>
> Cc: "valhalla-spec-experts" <valhalla-spec-experts@openjdk.java.net>
> Sent: Thursday, January 27, 2022 10:00:20 PM
> Subject: Re: [External] : Re: VM model and aconst_init

> On 25 Jan 2022, at 2:50, fo...@univ-mlv.fr wrote:

>> Let's talk a bit about having the L world and the Q world completely
>> disjoint, at least from the bytecode verifier's POV.
> Yes please; indeed, that is my strong recommendation. Doing this (Q>:<L
> perhaps, the “squinting cat operator”?) allows the interpreter (as well as
> the verifier) to concentrate on one “mode” at a time, for values on the
> stack. This in turn unlocks implementation possibilities which are not as
> practical when Q<:L (our previous plan of record).

> But keeping Q and L disjoint is a relative notion. Clearly they have to
> interoperate somewhere. And I believe that “somewhere”, where Q’s can become
> L’s and vice versa, is in the dynamic behavior of bytecodes, not in the type
> system (in the verifier) which relates multiple bytecodes.

>> It means that we need some checkcasts to move in both directions, from a
>> Q-type to an L-type and vice versa.
> Correct. Checkcast is the mode-change operator (in both directions). No need
> to create a new bytecode for either direction.
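
To make the mode change concrete, here is a sketch in the prototype's
surface syntax (hypothetical, nothing here is final); the comments show the
bytecode each conversion would compile to under the disjoint model:

    // hypothetical primitive-class syntax from the current prototypes
    primitive class Point { final int x = 0; final int y = 0; }

    void modes(Point q, Point.ref l) {
        Point.ref asL = q;      // Q -> L: compiles to `checkcast LPoint;`
        Point asQ = (Point) l;  // L -> Q: compiles to `checkcast QPoint;`
    }                           // (the L -> Q cast throws NPE if l is null)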

>> But at the same time, an array of L-type is a subtype of an array of Q-type?
> That depends on what you mean by “subtype”. The JVM has multiple type systems.
> In the verifier we might have a choice of allowing Q[]<:L[] or not. It’s
> better not. That puts no limitations on what the aaload and aastore
> instructions do. They just dynamically sense whether the array is flattened
> or not and DTRT. That has nothing to do with verifier types. In fact, it’s a
> separate job of the runtime type system.

> For the runtime type system, we will say that arrays are covariant across an
> extends, which is an augmentation of the subtype system. So for the runtime,
> T[]<:U[] iff T extends U, so Q[]<:L[] (for the same class in both modes). But
> in the verifier, Q[]>:<L[]. Don’t like it? Sorry, that’s the best we can do,
> given Java’s history with arrays and the weakness of the verifier.
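
If I follow, something like the following would then be accepted at runtime
even though the verifier sees the two array types as unrelated (hypothetical
syntax again; the explicit cast is the price of Q[]>:<L[] in the verifier):

    Point[] qs = new Point[10];         // runtime type [QPoint; (flattened)
    Point.ref[] ls = (Point.ref[]) qs;  // OK at runtime: Q[] <: L[] via
                                        // `extends`; checkcast for the verifier
    Point.ref p = ls[0];                // aaload senses the flattening, loads
    ls[0] = p;                          // aastore checks the store dynamically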

It did not occur to me that the verifier type system and the runtime type
system could be so divorced.

Currently, the verifier simply skips some verifications (related to
interfaces) that are done later at runtime. But you propose the opposite:
something that is not legal for the verifier yet legal at runtime.
I'm not saying it's impossible, but this idea has a huge engineering cost,
because it requires thinking in terms of two different mental models, so
every discrepancy between them can be a source of bugs.
Also, what does it mean for java.lang.invoke, which tries to follow the
bytecode semantics, but at runtime?
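
For a concrete instance of the question, java.lang.invoke already exposes
handles that mirror individual bytecodes, like arrayElementGetter, which
mirrors aaload (this snippet is real, runnable code). If aaload's runtime
behavior and the verifier's typing diverge for Q[]/L[], which of the two
should such a handle follow?

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;

    class ArrayHandleDemo {
        public static void main(String[] args) throws Throwable {
            // a method handle with the runtime behavior of aaload
            MethodHandle getter =
                MethodHandles.arrayElementGetter(Object[].class);
            Object first = (Object) getter.invoke(new Object[] { "hi" }, 0);
            System.out.println(first);  // prints: hi
        }
    }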

[...] 

>> Furthermore, I believe that subtyping is a key to avoid multiple bytecode
>> verification of the generics code.
> I recommend a far simpler technique: Just don’t. Don’t multiple-verify, and
> don’t do anything that would need such extra verification (with different
> types) to give the same answer (as the first verification). Also, don’t think
> of a JVM-loaded set of bytecodes as ever having more than one operational
> meaning.

I agree, no multiple verification. But just as the VM has to do some runtime
checks when there is a checkcast with an interface, the VM has to do
something when there is a TypeRestriction.

[...] 

> But don’t confuse the implementation of generics with the semantic model. The
> implementation will (in our plan of record) use something the verifier knows
> nothing about: dynamic type restrictions (or something equivalent) which
> manage the projections and embeddings of specific instance types into the
> generic bound type. These do not have to be (indeed must not be, see below)
> identified as verifier types. They are associations of pairs of JVM runtime
> types.

>> For example, with the TypeRestriction attribute [1], the restriction has to
>> be a subtype of the declared type/descriptor.
> No, just no. (Not your fault; I suppose the TR docs are in a confusing state.)
> There’s no reason to make that restriction (linking TRs to the verifier type
> system), other than the futile obsession I called out above. And if you do,
> you will make yourself sorry when you remember the verifier’s I/J/F/D types.
> And for many other reasons, such as instance types which are inaccessible to
> the generic code. And the simple fact that bytecodes are not really
> generic-capable, directly, without serious (and futile) redesign.

> Making bytecodes directly generic (in the sense of contextually re-type-able)
> is never going to be as simple as using a TR mechanism which occurs apart
> from the verifier.

Yes.

> (Plus, I don’t trust the verifier to take on new jobs. In its existing job it
> has been an ever-flowing fountain of security bugs. It made a hash of jsr/ret,
> and continually messed up instance creation paths. I won’t re-hire it for any
> new jobs. It’s good for matching descriptors in symbolic references, and maybe
> for helping the interpreter make a minor optimization to the getfield and
> putfield bytecodes, but that’s about it. The v-table index caching for
> invokevirtual looks cool but is effectively a no-op. The JIT makes all of the
> above irrelevant to performance, once the JIT kicks in. So the verifier gives
> an incremental push to startup performance, or interpreter-only performance if
> anyone still cares. Plus it gives us extra “security” except when it doesn’t.
> OK, I’ll stop with the verifier now.)

It also gives meaningful (mostly) error messages when my students try to 
generate bytecodes :) 

> A “type system” for TRs is pretty meaningless, from the JVM POV. In the JVM,
> a TR (despite its name) is most fundamentally an operational specification of
> how to project (G->I) or embed (I->G) instance types into generic types. It’s
> a function pair. It can connect I=int to G=Object, which the JVM knows are
> disjoint types. A TR is very useful for implementing source-level type
> relations that the JVM doesn’t know directly about, such as int extends
> Object (int<|Object maybe?). So talking about subtypes in TRs is nonsense as
> far as I can see, unless in the special sense that “we use TRs in a
> translation strategy, to help extend the JVM’s limited subtyping notions to
> our language’s more expansive subtyping notions”. If TRs were a
> language-level construct, then, yeah, maybe we say a TR has some constraints
> about language-level subtypes. But it’s a JVM-only construct that plugs in
> where the JVM’s type system doesn’t work well enough.
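
Concretely, the function pair for I=int, G=Object is just boxing and
unboxing, which java.lang.invoke can already express; the record and method
names below are invented for illustration, not a proposed API:

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    // invented name, only to illustrate the "function pair" reading of a TR
    record ProjectEmbedPair(MethodHandle project, MethodHandle embed) {
        static ProjectEmbedPair intIntoObject() {
            MethodHandle id = MethodHandles.identity(int.class);  // (int)int
            return new ProjectEmbedPair(
                // project, G -> I: type (Object)int, unboxes
                id.asType(MethodType.methodType(int.class, Object.class)),
                // embed, I -> G: type (int)Object, boxes
                id.asType(MethodType.methodType(Object.class, int.class)));
        }
    }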

First, I don't think you need int extends Object, because you can have
Qjava/lang/Integer; which already extends Object. ArrayList<int> is then
another spelling of ArrayList<Qjava/lang/Integer;>, which is great because
it's a subtype (at runtime) of ArrayList<Integer>.
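
In the hypothetical specialized-generics syntax (none of this compiles
today), that gives:

    ArrayList<int> a = new ArrayList<>();  // ArrayList<Qjava/lang/Integer;>
    ArrayList<Integer> b = a;              // OK at runtime: the specialization
                                           // is a subtype of the erased form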

TR is not a silver bullet: a TR has only two paths, one using the declared
type and one using the restricted type. When there is inheritance, unlike
bridges, which can generate several paths (at the cost of several
methods/entry points), with a TR you are stuck with two, the fully optimized
and the un-optimized; anything in between will go through the un-optimized
path. Furthermore, any code with a wildcard will go through the un-optimized
path. So the runtime checks associated with TRs have to be fast. Hence the
idea to restrict them to subtyping checks, as in the sketch below.
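
A sketch of that two-path shape (all names invented, plain Java standing in
for what the VM would do): one specialized entry point with the restricted
type in its signature, and one generic entry that performs the fast check
and funnels into it. A caller that cannot prove the restriction statically,
e.g. through a wildcard, always pays the check:

    // invented names; a hand-written stand-in for a TR'd method pair
    final class IntBox {
        private int value;               // restricted (flattenable) layout

        void set(Object v) {             // generic entry: declared type
            setRestricted((Integer) v);  // the fast check: a cast/subtype test
        }

        void setRestricted(int v) {      // specialized entry: restricted type
            value = v;                   // fully optimized path
        }

        int get() { return value; }
    }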

> (Should we rename TR to PE, “Project-Embedding Pair”? I like TR, but maybe
> only because I’m used to the term. As a JVM-level concept, it doesn’t need
> to get a highly polished name. The audience for it is limited.)

Rémi 
