hi John,

thanks for replying...

Having read that, I think part of the problem comes from the new-invokespecial-super sequence being split across two bytecodes. That means a lot can happen in between, including different control-flow paths, which makes the verifier's job difficult.

The other part is that I need to react to runtime types. Currently this is only possible with a generic handle that installs the real target later on. The problem is that the first call of the target is then made from inside the generic handle instead of from the callsite. In terms of object creation, this means I would have access to the object, and in the case of super-init calls I would have access to a not fully initialized object and could potentially do bad things with it.
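To make the "generic handle" pattern concrete, here is a minimal sketch using MutableCallSite. All class and method names are made up for illustration and this is not Groovy's actual runtime code. It only shows that the first invocation of the real target happens from inside the fallback, not from the callsite. A real runtime would also guard the installed target with a type check, which is left out here.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// All names here are hypothetical; this is not Groovy's runtime code.
final class LazyCallSiteSketch {
    static int fallbackCalls = 0;

    static String describeInteger(Object o) { return "int:" + o; }
    static String describeOther(Object o)   { return "other:" + o; }

    // The generic handle: selects a target from the runtime type of the
    // argument, installs it in the call site, then performs the first call.
    static Object fallback(MutableCallSite site, Object arg) throws Throwable {
        fallbackCalls++;
        String name = (arg instanceof Integer) ? "describeInteger" : "describeOther";
        MethodHandle target = MethodHandles.lookup()
                .findStatic(LazyCallSiteSketch.class, name,
                        MethodType.methodType(String.class, Object.class))
                .asType(site.type());
        site.setTarget(target);      // later calls go straight to the target
        return target.invoke(arg);   // the first call is made from in here
    }

    // Builds a fresh call site and calls through it twice with the same argument.
    static Object callTwice(Object arg) {
        try {
            MutableCallSite site = new MutableCallSite(
                    MethodType.methodType(Object.class, Object.class));
            MethodHandle generic = MethodHandles.lookup().findStatic(
                    LazyCallSiteSketch.class, "fallback",
                    MethodType.methodType(Object.class, MutableCallSite.class, Object.class));
            site.setTarget(generic.bindTo(site));
            MethodHandle invoker = site.dynamicInvoker();
            Object first = invoker.invoke(arg);   // routed through fallback
            Object second = invoker.invoke(arg);  // routed directly to the installed target
            return second;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

If the target were a super constructor, the fallback would be the code holding the half-built object during that first call, which is exactly the exposure described above.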

And that is even though I don't even need a handle that returns something. But since there is no real connection between slot 0 of the constructor I am in and the generic handle

But I wonder if there is really no way around that. Let me construct something crazy here... What if we had a dummy object instead? Let us call it GenericInstance for now. GenericInstance is internally connected to the partially generated class, but has no fields or methods offering access to it. The only way to create a GenericInstance would be a factory method from the indy API, like findSpecialConstructor or such. I would define the signature so that it returns the GenericInstance. The handle itself is supposed to realize a new-"transform arguments"-invokeSpecial kind of sequence. The verifier thus needs to accept that it does that. And there needs to be code that takes the resulting GenericInstance and places the real instance in local variable slot 0.

Since it is a two-fold mechanism, I cannot programmatically do anything with the GenericInstance object except pass it through. Only the part unwrapping it can access the real instance (and also check the class to be sure), and that would be VM code.

I think this way splitting the method or having a constructor equivalent is not required... but I am not sure something like GenericInstance can be done. In pure Java, probably not.

bye jochen

On 29.08.2015 03:40, John Rose wrote:
The invokespecial-super-init dance is the thing MHs can't quite do: the "super" call 
in every constructor (except Object.<init>).

It is very hard to secure this pattern; just ask anybody who has worked on 
(de-)serialization security.

But, we can look at it from a more limited point of view which might improve 
your use case, Jochen.

A method handle is supposed to be a fully competent replacement for hardwired bytecodes, and it is, 
except for invokespecial-super from a constructor.  The reason this is hard is that there is no way to 
constrain such a method handle, once constructed, to operate inside a constructor.  And superclasses 
have a right to expect that you are running their constructor as a unique, non-repeatable part of 
creating a subclass object.  (By "have a right" I really mean "it would be wrong to do 
the unexpected" by which I also mean "attack surfaces are likely to open up if we do this".)

So, is there a way to package up a method handle so that it can only be used as a unique, 
non-repeatable part of creating a subclass object?  Yes, it can:  Wire in an unconditional 
"new instance" operation, and immediately run the "invokespecial super" on the 
new thing.

Now the problem reduces to:  Your class (just like its super) has a right to 
expect that constructor code will be run on every newly-created instance (after 
the super constructor), before the new object is made available to other code.  
Can we package up the previous new-invokespecial-super method handle so it can 
only be used in this way?  Well, no, since every constructor *also* has a 
hardwired call to invokespecial; we are back to the pre-existing 
new-invokespecial type of MH.
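As a small illustration of the "pre-existing new-invokespecial type of MH": Lookup.findConstructor already yields a handle that allocates and initializes in one step, while the javadoc for findSpecial notes that methods named "<init>" are not visible to it. The sketch below uses made-up classes (Base, Derived) and is only meant to show that distinction.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical classes, used only to illustrate the two lookups.
final class NewInvokeSpecialSketch {
    static class Base {
        final String tag;
        Base(String tag) { this.tag = tag; }
    }

    static class Derived extends Base {
        // The super call below is the hardwired invokespecial Base.<init>.
        Derived(String tag) { super(tag + "!"); }
    }

    // findConstructor returns a fused "new + invokespecial <init>" handle.
    // The MethodType uses void as return type; the handle's type is (String)Derived.
    static Derived create(String tag) {
        try {
            MethodHandle ctor = MethodHandles.lookup().findConstructor(
                    Derived.class, MethodType.methodType(void.class, String.class));
            return (Derived) ctor.invoke(tag);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Asking findSpecial for a bare <init> is expected to be rejected,
    // so there is no handle for the super-init call on its own.
    static boolean bareInitIsAvailable() {
        try {
            MethodHandles.lookup().findSpecial(Base.class, "<init>",
                    MethodType.methodType(void.class, String.class),
                    NewInvokeSpecialSketch.class);
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }
}
```

The handle from create("x") always runs Derived's own constructor, including its fixed super call, so it gives no way to choose a different super constructor at runtime.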

There are several possible ways out, but the problem is delicate.  The purpose of constructors is to 
statically mark code that must be executed before any (normally published) reference to an object is 
reachable by non-class code.  If there were a way to statically mark code as "post-super-init" 
("<postsuperinit>"?), we could make an agreement with a class that such a method would 
serve as the equivalent of a constructor, but it would be the caller's responsibility to allocate the new 
instance *and* call the super init.  Allowing bytecode to call this stuff would require a bunch of new 
verifier rules, in a place where the verifier is already hard to understand.  Perhaps a method handle 
could be allowed to operate where normal bytecode cannot, but you see the problem:  Method handles are 
designed to give a dynamic alternative to things you can already do in bytecode.

The "post-super-init" convention can be a private convention within a class, in the special case of 
Groovy, since Groovy is responsible for generating the whole class, and can trust itself to invoke all 
necessary initialization code on each new instance.  So if you had a new-invokespecial-super MH in a private 
context within a Groovy-generated class, you could use it to create a "mostly blank" instance, and 
then fill it in before sharing it with anybody else.  Such an invokespecial-super MH could be adequately 
protected from other users by requiring that "Lookup.findSpecialConstructor" can only work on 
full-powered lookups, regardless of the accessibility of the super constructor.

There are two further problems with this, though.  First, constructors have a unique 
ability and obligation to initialize blank final variables (the non-static ones).  So the 
Lookup.findSpecialConstructor MH has to take an argument, not just for its 
super-constructor, but also for *each* final variable in the *current* class.  (Note that 
Lookup.findSetter will *not* allow finals to be set, because it cannot prove that the 
caller is somehow "inside" a constructor, and, even if inside it, is trustably 
acting on behalf of it.)  There are other ways to go, but you can see this problem too:  
The new-invokespecial operator has to take responsibility for working with the caller to 
fill in the blank finals.
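The remark about Lookup.findSetter can be checked with a short sketch. The Point class and canGetSetter helper are invented for the example; the only API behavior relied on is that findSetter throws IllegalAccessException for a final field, as its javadoc states.

```java
import java.lang.invoke.MethodHandles;

// Hypothetical class with one blank final and one ordinary field.
final class FinalSetterSketch {
    static class Point {
        final int x;   // blank final: must be assigned in a constructor
        int y;         // ordinary mutable field

        Point(int x) { this.x = x; }
    }

    // Returns true if a setter handle can be obtained for the named int field.
    static boolean canGetSetter(String field) {
        try {
            MethodHandles.lookup().findSetter(Point.class, field, int.class);
            return true;
        } catch (IllegalAccessException e) {
            return false;   // this is what happens for final fields
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Here canGetSetter("y") succeeds and canGetSetter("x") does not, even though the lookup has full access to Point. That is why a constructor-like MH would have to take the values for the blank finals as arguments.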

The second further problem is even more delicate.  The JVM enforces rules of calling <init> 
even (sometimes) against the wishes of people who generate class files.  We don't fully 
understand the practical effects of relaxing these rules.  Proofs of assertions (such as type 
correctness and security) require strong premises, and the rigid rules about <init> help 
provide such premises.  An example of a proof-failure would be somebody looking at a class, 
ensuring that all instances are secure based on the execution of <init> methods, but then 
failing to notice that the class *also* runs some instances through an alternate path, using 
new-invokespecial-super, which invalidates the proof by failing to run some crucial setup code.

With all that said, there is still wiggle room.  For example, one *possible* solution 
that might help Groovy, while being restrictive enough to avoid the problems above, 
would be to split <init> methods and sew them together again with method handles.

Suppose there were a reliable way to "split" an <init> method into two parts:  Everything up to the 
invokespecial-super-<init> call, and everything afterwards.  (Perhaps it must be preceded *only* by 
load-from-local opcodes.)  Call such <init> methods "splittable".  Not all will be splittable.  Then 
we could consider allowing a class to replace one of its splittable constructors by a new hybrid consisting of a 
differently-selected super-constructor, followed by the tail of the splittable constructor.  (Note that this neatly 
handles blank finals.)  It would not be valid for any party other than the sub-class itself to perform such a split, 
but it might, arguably, be reasonable for a class to do such a thing.

There are always many defects with such schemes.  In this case, there is no robust way to detect 
that a splittable constructor has in fact been split.  (I keep wanting to invent new bytecodes or 
verifier rules here!)  Any rule for splittability is going to be a little hacky, hence hard to 
understand and use correctly.  Specific constructors might be "coupled" strongly to 
matching super-constructors, in such a way that a mix-and-match will cause surprises, even to the 
author of the subclass.  (Having stuff happen by invisible magic gets old, as soon as you realize 
you have to vouch for the behavior of code which you can really only see in source form.)  Finally 
(as noted above) MHs are quite robustly understandable from the principle that they are "just 
another way" to do what bytecodes have already done; violating this principle pushes 
uncertainties into equivalence proofs about MHs and bytecodes.

In the end, I think Groovy may be better off using its ugly <init> bytecode 
sequence, where every subclass constructor calls (via a switch) every superclass constructor.

I hope this helps, although it's kind of disappointing.  We ran into the same 
dangerous dance, in the Valhalla bytecode interpreter, and had to fake it from 
random bits of the MH runtime.

— John

On Feb 26, 2015, at 2:27 AM, Jochen Theodorou <blackd...@gmx.org> wrote:

On 26.02.2015 01:02, Charles Oliver Nutter wrote:
After talking with folks at the Jfokus VM Summit, it seems like
there's a number of nice-to-have and a few need-to-have features we'd
like to see get into java.lang.invoke. Vladimir suggested I start a
thread on these features.

my biggest request: allow the call of a super constructor (like super(foo,bar)) 
using MethodHandles and have it understood by the JVM like a normal super 
constructor call... same for this(...)

Because what we currently do is annoying and a major pita, plus it bloats the 
bytecode we have to produce. And let us better not talk about speed or that 
small verifier change that made our hack unusable in several java update 
versions for 7 and 8.

This has been denied in the past because of security reasons... And given that 
we need dynamic argument types to determine the constructor to be called, and 
since we have to do a call from the runtime in the uncached case, I fully 
understand why this is not done... just... it would be nice to have a solution 
that does not require us doing basically a big switch table with several 
invokespecial calls

bye Jochen

Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org

mlvm-dev mailing list
