Introduction
The invokedynamic instruction is not designed for use from Java. And yet there
are use cases for syntax support. Java is (among other things) the primary
systems programming language for the JVM. If you are programming a dynamic
language system, you are probably coding a lot of Java. (There could be rare
exceptions, of self-hosting languages which boot from bytecodes.) If you are
using JSR 292, your Java code probably works with method handles and
invokedynamic. In fact, you probably want to code an occasional invokedynamic
instruction directly by hand, and find yourself resorting to alternatives like
ASM or indify.
A year ago the 292 prototype included basic syntax support, but we yanked it
for various reasons. This note is my attempt to reserve a few neurons for
thinking about what syntax support might look like in some future release.
The old support from last year cannot be restored, if only because the
essential shape of an invokedynamic instruction has changed. The new shape
strongly affects any design for syntax support.
An invokedynamic call site has a reference to a CONSTANT_InvokeDynamic CP entry
(tag 18). This entry includes:
1. name: a non-interpreted Utf8 string
2. signature: a type descriptor (inferred from or corresponding to actual
arguments and return value)
3. bsm: a bootstrap method (expressed as a CONSTANT_MethodHandle)
4. bsmargs: a series of zero or more extra static arguments (arbitrary CP
entries)
The first two items are packed, as usual, in a CONSTANT_NameAndType. The other
items are stored in the BootstrapMethods class file attribute. (Both pairs of
items are referred to by index from the CONSTANT_InvokeDynamic CP entry. This
implies that they are shareable, which is an interesting property for some use
cases.)
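To make the four items concrete, here is a minimal sketch in present-day Java that plays the role of the JVM by calling a bootstrap method by hand. The class IndyShapeDemo and its methods bsm and link are hypothetical names invented for illustration, and the String return type is an assumption of the sketch; only the java.lang.invoke calls are real API.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;

// Hypothetical demo class; its names are illustrative only.
class IndyShapeDemo {
    // Item 3, the bootstrap method. At link time it receives item 1 (name),
    // item 2 (signature, as a MethodType), and item 4 (static arguments).
    static CallSite bsm(MethodHandles.Lookup lookup, String name,
                        MethodType type, Object... bsmargs) {
        String summary = name + type + Arrays.toString(bsmargs);
        // Assumption: the call site returns String. The target ignores its
        // dynamic arguments and returns a summary of what the BSM was given.
        MethodHandle constant = MethodHandles.constant(String.class, summary);
        MethodHandle target =
            MethodHandles.dropArguments(constant, 0, type.parameterList());
        return new ConstantCallSite(target);
    }

    // Plays the role of the JVM linking and then executing one call site.
    static String link() {
        try {
            MethodType type = MethodType.methodType(String.class, int.class);
            CallSite site = bsm(MethodHandles.lookup(), "name1", type, "argh", 42);
            return (String) site.dynamicInvoker().invokeExact(7);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(link());
    }
}
```

In a real class file the name, signature, and static arguments would come from the constant pool and the BootstrapMethods attribute rather than from Java source.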
Dynamic Call Site Syntax
In order to express all these degrees of freedom in Java syntax support, we
would need a syntax approximately like this:
Expression = IndyHead '.' IndyTail
IndyHead = 'invokedynamic' '(' BootstrapMethodEntry ')'
IndyTail = name:Identifier dynargs:MethodArguments
MethodArguments = '(' Expression ... ')'
BootstrapMethodEntry =
    StaticMethodName bsmargs:[',' BSMConstantExpression]*
  | 'new' StaticClassName bsmargs:[',' BSMConstantExpression]*
BSMConstantExpression =
    IntegerLiteral | LongLiteral | FloatLiteral | DoubleLiteral
  | ClassName '.' 'class'
  | StringLiteral
  | MethodHandleLiteral | MethodTypeLiteral
The fixed tokens (a keyword 'invokedynamic', parens, dots, commas, etc.) don't
matter as much as the variable parts.
Each unique BootstrapMethodEntry corresponds to an element of the
BootstrapMethods classfile attribute.
Each distinct IndyTail corresponds to a CONSTANT_NameAndType entry in the
constant pool. (Actually, the return value type needs to be inferred also.
Let's just assume the same rules as for MethodHandle.invokeExact.)
An expression IndyHead.IndyTail would compile to code which pushes dynargs (as
with MethodHandle.invokeExact), plus a subsequent invokedynamic instruction
with the following structure:
CONSTANT_InvokeDynamic {
  CONSTANT_NameAndType {
    name
    signature = (derived from dynargs and optional result cast)
  }
  BootstrapMethods[N] {
    bsm = CONSTANT_MethodHandle { ...REF_invokeStatic or REF_newInvokeSpecial... }
    bsmargs = { ...constant... }
  }
}
This would give Java programmers a basic ability to code up invokedynamic
instructions. Examples:
String x = (String) invokedynamic(new MyCallSite, File.class).name1(false, 3.14);
invokedynamic(MyModule.myBSM, "argh", 42).name2("foo");
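The first example can be emulated by hand in present-day Java, which may help show where the signature comes from. This is only a sketch: the MyCallSite body, the class MyCallSiteDemo, and the method callName1 are hypothetical, and the signature (boolean,double)String is what I assume a compiler would derive from the arguments (false, 3.14) and the (String) cast.

```java
import java.io.File;
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class MyCallSiteDemo {
    // Hypothetical stand-in for MyCallSite. Its constructor has the shape of
    // a bootstrap method: lookup, name, type, then one static argument.
    static class MyCallSite extends MutableCallSite {
        MyCallSite(MethodHandles.Lookup lookup, String name, MethodType type,
                   Class<?> staticArg) {
            super(type);
            String answer = name + " linked with " + staticArg.getSimpleName();
            MethodHandle constant = MethodHandles.constant(String.class, answer);
            // Ignore the dynamic arguments; just report how we were linked.
            setTarget(MethodHandles.dropArguments(constant, 0, type.parameterList()));
        }
    }

    static String callName1() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            // Derived from (false, 3.14) plus the (String) result cast.
            MethodType siteType =
                MethodType.methodType(String.class, boolean.class, double.class);
            // The "new MyCallSite" form: a constructor handle as the BSM.
            MethodType ctorType = MethodType.methodType(void.class,
                MethodHandles.Lookup.class, String.class, MethodType.class, Class.class);
            MethodHandle bsm = lookup.findConstructor(MyCallSite.class, ctorType);
            CallSite site = (CallSite) bsm.invoke(lookup, "name1", siteType, File.class);
            return (String) site.dynamicInvoker().invokeExact(false, 3.14);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callName1());
    }
}
```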
Bootstrap Method Abstraction
Note that the entirety of every bootstrap method entry (BSM plus optional
arguments) would have to be mentioned in every IndyHead expression. This is
OK for some uses, but gets old quickly. Luckily, there is this concept of
abstraction which allows the programmer to make abbreviations... What might it
look like in this case? For starters, it would be carried in a class of some
sort (since everything to do with API names is a class, interface, annotation,
or enum). But it would have to be a class member of a new sort, one which was
a prepackaged IndyHead. Let's call it a "bootstrap method declaration".
Here's a straw man:
ClassBodyDeclaration = BootstrapMethodDeclaration
BootstrapMethodDeclaration =
    [Modifier]* 'invokedynamic' Identifier '(' BootstrapMethodEntry ')' ';'
A qualified name (or static import) could access the IndyHead as a bsmname:
IndyHead = QualifiedIdentifier
Any class or interface could have bootstrap method declarations mixed in. The
only valid use of such a declaration is as an IndyHead construct, meaning that
a use of the name must be followed immediately by a dot, a name and some
arguments.
For example:
public class MyIndyCarrier {
  public static invokedynamic indy1(new MyCallSite, File.class);
  private static invokedynamic indy2(MyModule.myBSM, "argh", 42);
  public invokedynamic indy3(MyOtherModule.myOtherBSM);
}
String x = (String) MyIndyCarrier.indy1.name1(false, 3.14);
indy2.name2("foo");
MyIndyCarrier obj3 = ...;
obj3.indy3.name3("arg3"); // implicitly inserts obj3 before "arg3"
Bootstrap method declarations would be useful in static and non-static
versions. A non-static bootstrap method is selected from the type of a
receiver object. The receiver object itself becomes the first argument to the
invoke.
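One way to read the non-static case is that the call site's signature gains the receiver type as a leading parameter. The sketch below assumes that reading; ReceiverDemo, the tag field, and the target method are hypothetical, and the BSM is called by hand instead of by the JVM.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ReceiverDemo {
    // Hypothetical carrier class; the tag only makes the receiver visible.
    static class MyIndyCarrier {
        final String tag;
        MyIndyCarrier(String tag) { this.tag = tag; }
    }

    // The behavior the call site links to; the receiver arrives first.
    static String target(MyIndyCarrier self, String arg) {
        return self.tag + ":" + arg;
    }

    // Hypothetical stand-in for MyOtherModule.myOtherBSM.
    static CallSite myOtherBSM(MethodHandles.Lookup lookup, String name,
                               MethodType type) throws ReflectiveOperationException {
        MethodHandle mh = lookup.findStatic(ReceiverDemo.class, "target", type);
        return new ConstantCallSite(mh);
    }

    static String callName3() {
        try {
            // As written at the use point: name3("arg3") returning String.
            MethodType written = MethodType.methodType(String.class, String.class);
            // Assumed rule: the receiver type is inserted as parameter 0.
            MethodType linked = written.insertParameterTypes(0, MyIndyCarrier.class);
            CallSite site = myOtherBSM(MethodHandles.lookup(), "name3", linked);
            MyIndyCarrier obj3 = new MyIndyCarrier("obj3");
            return (String) site.dynamicInvoker().invokeExact(obj3, "arg3");
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callName3());
    }
}
```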
Name Control
All of the examples so far allow unlimited names (for the name in the
CONSTANT_NameAndType). Some variation might be useful which features fixed
names. Here's a crude first cut:
public class MyIndyCarrier {
  public static invokedynamic indy4 = indy2.name4;
}
Then the first and second lines below would be equivalent, but the third and
fourth lines would not be legal:
MyIndyCarrier.indy2.name4("foo");
MyIndyCarrier.indy4("foo");
MyIndyCarrier.indy4("foo").name4;
MyIndyCarrier.indy4("foo").notName4;
There are two things going on here: First is a way of binding a specific name
into a "canned" bootstrap method declaration. Second is a way of building one
BSM declaration (indy4) on top of a previous one (indy2).
Additional Static Arguments
All of the above examples of abstraction assume that all static parameters are
supplied at the point of declaration, not use. Realistically, some or all
might be specified at the point of use. Because the format of a
BootstrapMethods entry is a linear list of static arguments (after the BSM
itself), it's simple and useful to allow static arguments to accumulate. That
is, a bootstrap method declaration can define a BSM plus zero or more static
arguments. Then, a derived BSM declaration (if there is such a thing) can add
more static arguments. Finally, the use point can add a final set of static
arguments plus the name.
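The accumulation rule can be modeled in ordinary Java as a BSM handle plus a growing list of static arguments. This is a sketch of the idea only, not of any proposed compiler behavior: BsmDecl, with, link, and the fixed-arity myBSM are all hypothetical names, and the BSM is invoked by hand.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class StaticArgAccumulationDemo {
    // Hypothetical model of a bootstrap method declaration:
    // a BSM plus the static arguments accumulated so far.
    static final class BsmDecl {
        final MethodHandle bsm;
        final List<Object> staticArgs;

        BsmDecl(MethodHandle bsm, List<Object> staticArgs) {
            this.bsm = bsm;
            this.staticArgs = Collections.unmodifiableList(new ArrayList<>(staticArgs));
        }

        // A derived declaration, or a use point, appends more static arguments.
        BsmDecl with(Object... more) {
            List<Object> all = new ArrayList<>(staticArgs);
            all.addAll(Arrays.asList(more));
            return new BsmDecl(bsm, all);
        }

        // Emulates linkage: lookup, name, type, then the linear argument list.
        CallSite link(MethodHandles.Lookup lookup, String name, MethodType type)
                throws Throwable {
            List<Object> args = new ArrayList<>();
            args.add(lookup);
            args.add(name);
            args.add(type);
            args.addAll(staticArgs);
            return (CallSite) bsm.invokeWithArguments(args);
        }
    }

    // Hypothetical BSM expecting exactly three static arguments.
    static CallSite myBSM(MethodHandles.Lookup lookup, String name, MethodType type,
                          String a, int b, String c) {
        String summary = name + "(" + a + ", " + b + ", " + c + ")";
        return new ConstantCallSite(
            MethodHandles.constant(String.class, summary).asType(type));
    }

    static String demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType bsmType = MethodType.methodType(CallSite.class,
                MethodHandles.Lookup.class, String.class, MethodType.class,
                String.class, int.class, String.class);
            MethodHandle bsm =
                lookup.findStatic(StaticArgAccumulationDemo.class, "myBSM", bsmType);
            BsmDecl base = new BsmDecl(bsm, Collections.<Object>emptyList()).with("argh");
            BsmDecl derived = base.with(42);           // derived declaration
            CallSite site = derived.with("late")       // use point adds the rest
                .link(lookup, "name2", MethodType.methodType(String.class));
            return (String) site.dynamicInvoker().invokeExact();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```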
For example, a formatted printing facility might want to specify a format
string as a static argument to the bootstrap method. Example:
public class MyFormatter {
  public invokedynamic using (MyFormatterBSM);
}
MyFormatter fmt = ...;
fmt.using("%d: %s").format(42, "the answer");
The big problem with the above rough cut is the noise word "format". There
would have to be a way to declare that no such name is expected. Perhaps:
public class MyFormatter {
  public invokedynamic format (MyFormatterBSM, String fmt) . void;
}
In such a case, the use point has to include a mix of static BSM arguments and
dynamic arguments, and these have to be distinguished somehow.
So I sneaked in another idea here, of formal parameters for static BSM
arguments. (I.e., String fmt instead of some specific string "foo".) Thus,
the first few arguments to a use point must be static constants:
MyFormatter fmt = ...;
fmt.format("%d: %s", 42, "the answer");
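For what it is worth, a bootstrap method of this kind can already be written against java.lang.invoke today. The sketch below is one possible shape and makes several assumptions: the body of myFormatterBSM is invented, the call site returns the formatted String instead of void so the result can be inspected, the MyFormatter receiver is left out, and the BSM is called by hand.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class FormatIndyDemo {
    // Hypothetical BSM: the format string is a static argument, bound once
    // at link time; the remaining arguments are supplied on each call.
    static CallSite myFormatterBSM(MethodHandles.Lookup lookup, String name,
                                   MethodType type, String fmt)
            throws ReflectiveOperationException {
        MethodHandle format = lookup.findStatic(String.class, "format",
            MethodType.methodType(String.class, String.class, Object[].class));
        // Bind the static format string: (Object[])String remains.
        MethodHandle bound = MethodHandles.insertArguments(format, 0, fmt);
        // Collect the dynamic arguments into the Object[] and adapt the
        // result to the call site's exact signature (boxing as needed).
        MethodHandle target = bound
            .asCollector(Object[].class, type.parameterCount())
            .asType(type);
        return new ConstantCallSite(target);
    }

    static String formatAnswer() {
        try {
            // Dynamic part of fmt.format("%d: %s", 42, "the answer").
            MethodType type =
                MethodType.methodType(String.class, int.class, String.class);
            CallSite site =
                myFormatterBSM(MethodHandles.lookup(), "format", type, "%d: %s");
            return (String) site.dynamicInvoker().invokeExact(42, "the answer");
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(formatAnswer());  // prints "42: the answer"
    }
}
```

Because the format string reaches the BSM once, at link time, a smarter BSM could parse it there and build a specialized target; this sketch simply defers to String.format.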
Conclusion
The syntaxes above are intentionally weak looking. Again, my present point is
not to propose a syntax but rather to show what might be the relevant degrees
of freedom (for the end user) in syntax support for invokedynamic.
This is a back-burner project. However, I don't want it to slide all the way
off the stove and into the garbage.
-- John