The invokespecial-super-init dance is the one thing MHs can't quite do:  the 
"super" call that every constructor (except Object.<init>) has to make.

It is very hard to secure this pattern; just ask anybody who has worked on 
(de-)serialization security.

But, we can look at it from a more limited point of view which might improve 
your use case, Jochen.

A method handle is supposed to be a fully competent replacement for hardwired 
bytecodes, and it is, except for invokespecial-super from a constructor.  The 
reason this is hard is that there is no way to constrain such a method handle, 
once constructed, to operate inside a constructor.  And superclasses have a 
right to expect that you are running their constructor as a unique, 
non-repeatable part of creating a subclass object.  (By "have a right" I really 
mean "it would be wrong to do the unexpected" by which I also mean "attack 
surfaces are likely to open up if we do this".)
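
To make the gap concrete, here is a minimal sketch using only the existing java.lang.invoke API.  The classes Base and Derived are hypothetical, made up for illustration.  An ordinary invokespecial-super call on a normal method can be expressed with Lookup.findSpecial, but the same request for the name "<init>" is refused at lookup time.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical classes, for illustration only.
class Base {
    Base() { }
    String greet() { return "base"; }
}

class Derived extends Base {
    @Override String greet() { return "derived"; }

    // invokespecial-super on an ordinary method: findSpecial can express this.
    static String callSuperGreet() {
        try {
            MethodHandle mh = MethodHandles.lookup().findSpecial(
                    Base.class, "greet",
                    MethodType.methodType(String.class), Derived.class);
            // Bypasses the override in Derived and runs Base.greet.
            return (String) mh.invoke(new Derived());
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // The same request for the super constructor is rejected by the lookup.
    static boolean superInitIsRefused() {
        try {
            MethodHandles.lookup().findSpecial(
                    Base.class, "<init>",
                    MethodType.methodType(void.class), Derived.class);
            return false;
        } catch (ReflectiveOperationException e) {
            // The lookup API does not expose methods named "<init>" here.
            return true;
        }
    }
}
```

Calling Derived.callSuperGreet() should yield the superclass behavior, while Derived.superInitIsRefused() reports that no such handle can be obtained.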

So, is there a way to package up a method handle so that it can only be used as 
a unique, non-repeatable part of creating a subclass object?  Yes, there is:  
Wire in an unconditional "new instance" operation, and immediately run the 
"invokespecial super" on the new thing.

Now the problem reduces to:  Your class (just like its super) has a right to 
expect that constructor code will be run on every newly-created instance (after 
the super constructor), before the new object is made available to other code.  
Can we package up the previous new-invokespecial-super method handle so it can 
only be used in this way?  Well, no, since every constructor *also* has a 
hardwired call to invokespecial; we are back to the pre-existing 
new-invokespecial type of MH.
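
For reference, the pre-existing new-invokespecial kind of MH is what Lookup.findConstructor produces.  The sketch below (the class Counted and its fields are hypothetical) shows the property discussed above:  each invocation allocates a fresh object and runs the whole constructor on it exactly once, so there is no way to aim the handle at an already-existing instance.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical class, for illustration only.
class Counted {
    static int constructorRuns = 0;
    final int id;

    Counted(int id) {
        this.id = id;
        constructorRuns++;   // records that the constructor body ran
    }

    // Invokes the same constructor handle twice.
    static Counted[] makeTwo() {
        try {
            MethodHandle ctor = MethodHandles.lookup().findConstructor(
                    Counted.class, MethodType.methodType(void.class, int.class));
            Counted a = (Counted) ctor.invoke(1);   // new + invokespecial <init>
            Counted b = (Counted) ctor.invoke(2);   // a second, distinct object
            return new Counted[] { a, b };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```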

There are several possible ways out, but the problem is delicate.  The purpose 
of constructors is to statically mark code that must be executed before any 
(normally published) reference to an object is reachable by non-class code.  If 
there were a way to statically mark code as "post-super-init" 
("<postsuperinit>"?), we could make an agreement with a class that such a 
method would serve as the equivalent of a constructor, but it would be the 
caller's responsibility to allocate the new instance *and* call the super init. 
 Allowing bytecode to call this stuff would require a bunch of new verifier 
rules, in a place where the verifier is already hard to understand.  Perhaps a 
method handle could be allowed to operate where normal bytecode cannot, but you 
see the problem:  Method handles are designed to give a dynamic alternative to 
things you can already do in bytecode.

The "post-super-init" convention can be a private convention within a class, in 
the special case of Groovy, since Groovy is responsible for generating the 
whole class, and can trust itself to invoke all necessary initialization code 
on each new instance.  So if you had a new-invokespecial-super MH in a private 
context within a Groovy-generated class, you could use it to create a "mostly 
blank" instance, and then fill it in before sharing it with anybody else.  Such 
an invokespecial-super MH could be adequately protected from other users by 
requiring that "Lookup.findSpecialConstructor" can only work on full-powered 
lookups, regardless of the accessibility of the super constructor.
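
A rough sketch of what such a private convention could look like with today's API.  Lookup.findSpecialConstructor does not exist, so this uses Lookup.findConstructor on a private constructor that does nothing but call super, as a stand-in for the new-invokespecial-super MH.  The names Parent, Generated, postSuperInit and create are all hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical classes, for illustration only.
class Parent {
    final String tag;
    Parent(String tag) { this.tag = tag; }
}

class Generated extends Parent {
    int state;   // non-final, so it can be filled in after the super call

    // Stand-in for "new-invokespecial-super": runs only the super constructor.
    private Generated(String tag) { super(tag); }

    // Private "post-super-init" code the class trusts itself to always run.
    private void postSuperInit(int state) { this.state = state; }

    static Generated create(String tag, int state) {
        try {
            // Full-powered lookup: created inside Generated itself.
            MethodHandle blank = MethodHandles.lookup().findConstructor(
                    Generated.class, MethodType.methodType(void.class, String.class));
            Generated g = (Generated) blank.invoke(tag);   // "mostly blank" instance
            g.postSuperInit(state);                         // fill in before sharing
            return g;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // A lookup without private access cannot obtain the same handle.
    static boolean publicLookupIsRefused() {
        try {
            MethodHandles.publicLookup().findConstructor(
                    Generated.class, MethodType.methodType(void.class, String.class));
            return false;
        } catch (IllegalAccessException e) {
            return true;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that this only works because the field filled in afterwards is not final, which leads to the first problem below.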

There are two further problems with this, though.  First, constructors have a 
unique ability and obligation to initialize blank final variables (the 
non-static ones).  So the Lookup.findSpecialConstructor MH has to take an 
argument, not just for its super-constructor, but also for *each* final 
variable in the *current* class.  (Note that Lookup.findSetter will *not* allow 
finals to be set, because it cannot prove that the caller is somehow "inside" a 
constructor, and, even if inside it, is trustably acting on behalf of it.)  
There are other ways to go, but you can see this problem too:  The 
new-invokespecial operator has to take responsibility for working with the 
caller to fill in the blank finals.
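
The restriction on Lookup.findSetter can be observed directly.  In this small sketch (the class Point is hypothetical), asking for a setter on a final instance field fails even with a full-powered lookup inside the declaring class, while the non-final field next to it is fine.

```java
import java.lang.invoke.MethodHandles;

// Hypothetical class, for illustration only.
class Point {
    final int x;   // blank final, assigned only in the constructor
    int y;         // ordinary mutable field

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Reports whether a setter handle can be obtained for the named int field.
    static boolean canFindSetter(String fieldName) {
        try {
            MethodHandles.lookup().findSetter(Point.class, fieldName, int.class);
            return true;
        } catch (IllegalAccessException e) {
            // Thrown for final fields: the lookup grants no write access.
            return false;
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }
}
```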

The second further problem is even more delicate.  The JVM enforces rules of 
calling <init> even (sometimes) against the wishes of people who generate class 
files.  We don't fully understand the practical effects of relaxing these 
rules.  Proofs of assertions (such as type correctness and security) require 
strong premises, and the rigid rules about <init> help provide such premises.  
An example of a proof-failure would be somebody looking at a class, ensuring 
that all instances are secure based on the execution of <init> methods, but 
then failing to notice that the class *also* runs some instances through an 
alternate path, using new-invokespecial-super, which invalidates the proof by 
failing to run some crucial setup code.

With all that said, there is still wiggle room.  For example, one *possible* 
solution that might help Groovy, while being restrictive enough to avoid the 
problems above, would be to split <init> methods and sew them together again 
with method handles.

Suppose there were a reliable way to "split" an <init> method into two parts:  
Everything up to the invokespecial-super-<init> call, and everything 
afterwards.  (Perhaps it must be preceded *only* by load-from-local opcodes.)  
Call such <init> methods "splittable".  Not all will be splittable.  Then we 
could consider allowing a class to replace one of its splittable constructors 
by a new hybrid consisting of a differently-selected super-constructor, 
followed by the tail of the splittable constructor.  (Note that this neatly 
handles blank finals.)  It would not be valid for any party other than the 
sub-class itself to perform such a split, but it might, arguably, be reasonable 
for a class to do such a thing.

There are always many defects with such schemes.  In this case, there is no 
robust way to detect that a splittable constructor has in fact been split.  (I 
keep wanting to invent new bytecodes or verifier rules here!)  Any rule for 
splittability is going to be a little hacky, hence hard to understand and use 
correctly.  Specific constructors might be "coupled" strongly to matching 
super-constructors, in such a way that a mix-and-match will cause surprises, 
even to the author of the subclass.  (Having stuff happen by invisible magic 
gets old, as soon as you realize you have to vouch for the behavior of code 
which you can really only see in source form.)  Finally (as noted above) MHs 
are quite robustly understandable from the principle that they are "just another 
way" to do what bytecodes have already done; violating this principle pushes 
uncertainties into equivalence proofs about MHs and bytecodes.

In the end, I think Groovy may be better off using its ugly <init> bytecode 
sequence, where every subclass constructor calls (via a switch) every 
superclass constructor.
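
Java source cannot express a switch around several super calls inside a single <init>, so the following is only a source-level analogue of that bytecode pattern, not what Groovy actually emits.  It assumes hypothetical classes Shape and ScriptShape, and moves the switch into a static factory:  one arm per superclass constructor, each arm leading to a hardwired super call.

```java
// Hypothetical classes, for illustration only.
class Shape {
    final String description;
    Shape(int side) { description = "square " + side; }
    Shape(int w, int h) { description = "rect " + w + "x" + h; }
}

class ScriptShape extends Shape {
    // One private constructor per superclass constructor,
    // each with its own hardwired invokespecial-super.
    private ScriptShape(int side) { super(side); }
    private ScriptShape(int w, int h) { super(w, h); }

    // Runtime selection on the dynamic arguments, one arm per super constructor.
    static ScriptShape create(Object... args) {
        switch (args.length) {
            case 1:
                return new ScriptShape((Integer) args[0]);
            case 2:
                return new ScriptShape((Integer) args[0], (Integer) args[1]);
            default:
                throw new IllegalArgumentException("no matching super constructor");
        }
    }
}
```

The cost is visible even in this small form:  the number of arms grows with the number of superclass constructors, which is the bloat complained about in the quoted message.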

I hope this helps, although it's kind of disappointing.  We ran into the same 
dangerous dance, in the Valhalla bytecode interpreter, and had to fake it from 
random bits of the MH runtime.

— John

On Feb 26, 2015, at 2:27 AM, Jochen Theodorou <blackd...@gmx.org> wrote:
> 
> On 26.02.2015 01:02, Charles Oliver Nutter wrote:
>> After talking with folks at the Jfokus VM Summit, it seems like
>> there's a number of nice-to-have and a few need-to-have features we'd
>> like to see get into java.lang.invoke. Vladimir suggested I start a
>> thread on these features.
> 
> my biggest request: allow the call of a super constructor (like 
> super(foo,bar)) using MethodHandles and have it understood by the JVM like a 
> normal super constructor call... same for this(...)
> 
> Because what we currently do is annoying and a major pita, plus it bloats the 
> bytecode we have to produce. And let us better not talk about speed or 
> that small verifier change that made our hack unusable in several java update 
> versions for 7 and 8.
> 
> This has been denied in the past because of security reasons... And given 
> that we need dynamic argument types to determine the constructor to be 
> called, and since we have to do a call from the runtime in the uncached 
> case, I fully understand why this is not done... just... it would be nice to 
> have a solution that does not require us doing basically a big switch table 
> with several invokespecial calls
> 
> bye Jochen
> 
> -- 
> Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
> blog: http://blackdragsview.blogspot.com/
> german groovy discussion newsgroup: de.comp.lang.misc
> For Groovy programming sources visit http://groovy-lang.org
> 
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
