binding to indy with an annotation-driven reweaver

2016-07-21 Thread John Rose
Inspired in part by a recent exchange between Remi and Charlie[1],
I've been thinking recently, again, about binding Java APIs to indy.

[1]: https://groups.google.com/forum/#!topic/jvm-languages/IjIEzDc_d3U

I think I have a way to make it work, and (what is more)
I think the end result looks pretty good.  Even better,
along the way we can create a mechanism for naturally
constant-folding selected method calls (like List.of)
at link-time.  (Library defined constants ahead!)

There are two things which make all this hard.
First, indy calls its BSM at link time (for the particular
indy instruction, since the JVM is lazily linked),
but the Java language does not expose link-time operations,
except very indirectly (in <clinit> code, for example).
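
As a rough illustration of what "link time" means here, the sketch below simulates in plain Java what the JVM does for a single indy instruction: it runs the bootstrap method once, on first execution, and afterwards only invokes the target of the returned call site. Java source cannot emit invokedynamic, so the linkage is done by hand; the names LinkTimeSketch, bootstrap and callSite are hypothetical, and the lazy initialization ignores the thread-safety that a real JVM provides.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class LinkTimeSketch {
    // Counts how often the bootstrap method runs (expected: once per site).
    static int bootstrapCalls = 0;

    // The method the call site ends up bound to.
    static int length(String s) { return s.length(); }

    // Usual BSM shape: (Lookup, name, type) -> CallSite.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        bootstrapCalls++;
        MethodHandle target = lookup.findStatic(LinkTimeSketch.class, name, type);
        return new ConstantCallSite(target);
    }

    // Stand-in for one indy instruction; null means "not yet linked".
    static CallSite site;

    static int callSite(String s) {
        try {
            if (site == null) {
                // Lazy linkage: happens on first execution only.
                site = bootstrap(MethodHandles.lookup(), "length",
                        MethodType.methodType(int.class, String.class));
            }
            return (int) site.dynamicInvoker().invokeExact(s);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

Calling callSite twice runs the bootstrap only once; the second call goes straight to the linked target.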

Second, many good indy use cases are at least partly
signature-polymorphic, just like method handles. 
But the Java language does not allow you to generify over
method type signatures (e.g., argument types of (), (int),
(int,int), (String), (String,int), etc., all in one API point).
Even Valhalla only lets you generify over one value at
a time.  (One step at a time!)
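
To show what signature polymorphism buys, here is a small sketch using MethodHandle.invokeExact, which is one API point that accepts several different argument shapes. The class SigPolySketch and its helper methods are hypothetical and exist only for this illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class SigPolySketch {
    static String none() { return "()"; }
    static String one(int a) { return "(" + a + ")"; }
    static String two(String s, int a) { return "(" + s + "," + a + ")"; }

    static String demo() {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodHandle h0 = l.findStatic(SigPolySketch.class, "none",
                    MethodType.methodType(String.class));
            MethodHandle h1 = l.findStatic(SigPolySketch.class, "one",
                    MethodType.methodType(String.class, int.class));
            MethodHandle h2 = l.findStatic(SigPolySketch.class, "two",
                    MethodType.methodType(String.class, String.class, int.class));
            // Same method name (invokeExact), three different call signatures:
            // ()String, (int)String and (String,int)String.
            String a = (String) h0.invokeExact();
            String b = (String) h1.invokeExact(1);
            String c = (String) h2.invokeExact("x", 2);
            return a + b + c;
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

An ordinary generic Java method cannot be declared this way; it would need one overload per argument shape, or varargs with boxing.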

Note that these two hard points might occur at the same
time, since some interesting BSMs are in fact signature
polymorphic.  (Well, at least they are varargs methods,
and varargs is an OK substitute for S-P, at link time.)

OK, so we are reduced to baking special handling for those
sorts of things into the language (as MethodHandle and
VarHandle do), or passing some sort of smoke signal
through the language and reweaving the bytecodes
(as Remi is so good at).

How would we signal a BSM call, through Java, in a way
that a bytecode reweaver could recognize it?

Here's an answer, maybe the simplest answer:  Mark
some API points as BSMs (in disguise; don't let javac
know!).  When the reweaver encounters a BSM call in
bytecode, it has to collect all the operands and ensure
that they are constants.  If they are constants, then
the reweaver collects the whole call, and sticks it
into an auxiliary static method (or pattern-matches
the whole call into an indy bootstrap specifier, if
possible).  If they are not constants, it is a reweaving
error.

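import java.util.function.ToIntFunction;

class IndyTrickster {
   enum StringIndexerKind { LENGTH, HASHCODE, CHAR0, PARSEINT }
   // The annotation marks this method as a BSM in disguise.
   @IndyTricks.AtLinkTime
   static ToIntFunction<String> indexString(StringIndexerKind k) { … }
}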

class IndyTrickUser {
   // Assumes static imports of indexString and the StringIndexerKind constants.
   int foo(String s) {
      return indexString(LENGTH).applyAsInt(s);  // returns s.length()
   }
   int bad(String s, StringIndexerKind k) {
      return indexString(k).applyAsInt(s);  // ERROR in reweaver?
   }
}

This is really just a mechanism for materializing link-time constants,
which all by itself is pretty interesting.

I've chosen enums here, because (a) they are constants, but
(b) they cannot be directly supported (today) as indy static
arguments.  So the reweaver has to dump some code in an
auxiliary method somewhere, at least to materialize the enum.
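
A minimal sketch of the effect such an auxiliary method could have, written as ordinary Java. It uses lazy class initialization of a holder class as a stand-in for indy linkage, which is an assumption made for illustration and not necessarily what a reweaver would emit. The body of indexString is hypothetical (the listing above elides it), as are the names FoldingSketch and Site0.

```java
import java.util.function.ToIntFunction;

class FoldingSketch {
    enum StringIndexerKind { LENGTH, HASHCODE, CHAR0, PARSEINT }

    // Counts evaluations of the "BSM in disguise".
    static int indexStringCalls = 0;

    // Hypothetical body; the original example leaves it unspecified.
    static ToIntFunction<String> indexString(StringIndexerKind k) {
        indexStringCalls++;
        switch (k) {
            case LENGTH:   return String::length;
            case HASHCODE: return String::hashCode;
            case CHAR0:    return s -> s.charAt(0);
            default:       return Integer::parseInt;
        }
    }

    // Auxiliary holder for one call site: the enum is materialized and
    // the call is evaluated once, when the holder class is initialized.
    static final class Site0 {
        static final ToIntFunction<String> CONSTANT =
                indexString(StringIndexerKind.LENGTH);
    }

    // What foo(s) would effectively do after reweaving.
    static int foo(String s) {
        return Site0.CONSTANT.applyAsInt(s);
    }
}
```

Repeated calls to foo reuse the same function object; indexString runs only on the first call.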

(And see JDK-8161256, "general data in constant pools", for
a better way forward.)

We can stop here and use this trick to create APIs that materialize
constant values of type List, Map, etc.  Put the AtLinkTime
annotation on List.of, for example, et voila.

This raises the question, what happens if an operand fails to be
a link-time constant?  Should the reweaver silently keep the
call as-is, so that it runs every time, instead of just once at
link time?  Probably yes, but what happens when somebody
is expecting link-time folding, and wants to hear if it fails?

One answer:  An optional argument to the AtLinkTime annotation,
to determine what to do if the folding fails (error/warn/allow).
The List.of guys would just silently allow the non-folding uses,
since that is what they are used for now.
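
One possible shape for such an annotation, as a sketch only. The element name onFailure, the enum OnFoldFailure and the choice of ERROR as the default are all assumptions. RUNTIME retention is used here just so the policy can be read back reflectively; a bytecode reweaver would only need the annotation to survive in the class file.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

class IndyTricksSketch {
    // What the reweaver should do when an operand is not a link-time constant.
    enum OnFoldFailure { ERROR, WARN, ALLOW }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface AtLinkTime {
        OnFoldFailure onFailure() default OnFoldFailure.ERROR;
    }

    // Strict use: non-constant operands are a reweaving error.
    @AtLinkTime
    static String strict() { return "strict"; }

    // List.of-style use: non-foldable calls are silently left as-is.
    @AtLinkTime(onFailure = OnFoldFailure.ALLOW)
    static String lenient() { return "lenient"; }

    // Reads the policy the way a tool might, here via reflection.
    static OnFoldFailure policyOf(String methodName) {
        try {
            Method m = IndyTricksSketch.class.getDeclaredMethod(methodName);
            AtLinkTime a = m.getAnnotation(AtLinkTime.class);
            return a == null ? null : a.onFailure();
        } catch (NoSuchMethodException e) {
            throw new AssertionError(e);
        }
    }
}
```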

It would seem that link-time constants can only be produced by
static methods, but that would be wrong:  You can have link-time
computations that call non-static methods also, as long as the
receiver of each such method is itself a link-time constant.
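
For instance, if the receiver is a constant list, a call to its size() method can be evaluated once at link time and the result pinned into the call site. The sketch below simulates that by hand with a ConstantCallSite whose target is a constant method handle; the names ReceiverFoldSketch, names, linkSize and size are hypothetical.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandles;
import java.util.Arrays;
import java.util.List;

class ReceiverFoldSketch {
    // Counts how often the link-time computation runs.
    static int linkCalls = 0;

    // Hypothetical static producer of a link-time constant receiver.
    static List<String> names() { return Arrays.asList("a", "b", "c"); }

    // Link-time step: the receiver is constant, so the virtual call
    // receiver.size() can be run once and its result folded.
    static CallSite linkSize() {
        linkCalls++;
        List<String> receiver = names();
        int folded = receiver.size();
        return new ConstantCallSite(MethodHandles.constant(int.class, folded));
    }

    // Stand-in for one indy instruction; null means "not yet linked".
    static CallSite site;

    static int size() {
        try {
            if (site == null) site = linkSize();
            return (int) site.dynamicInvoker().invokeExact();
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```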

(All of this suggests that the Java language should just have
a first-class notion of link-time constant, as it already has
compile-time constants.  But as we all know, Java grows
slowly and deliberately.  Experimenting first outside the JLS,
such as suggested here with reweavers, is a great way to
add weight to the case for change.)

What's next?  Well, so far the thing returned from the
constant-producing BSM has a type fully determined
by the static declaration of the BSM method.

You could grab some S-P magic from MethodHandles
to follow the BSM call by an S-P call, like this:

class IndyTrickster {
   @IndyTricks.AtLinkTime
   static MethodHandle doubler(Class<?> type) {
      return makeDoublerMH(type);
   }
   static MethodHandle makeDoublerMH(Class<?> type) {
      … // returns (type x) -> (String) Arrays.asList(x, x).toString()
   }
}
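
The body of makeDoublerMH is elided above. One way it might be written, purely as a sketch, is to look up a generic helper of type (Object)String and adapt it with asType to the requested parameter type. The helper doubled and the wrappers demoString and demoInt are hypothetical names introduced here.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;

class DoublerSketch {
    // Generic helper: (Object x) -> Arrays.asList(x, x).toString()
    static String doubled(Object x) {
        return Arrays.asList(x, x).toString();
    }

    // Returns a handle of type (type)String; asType inserts any boxing
    // or reference widening needed to reach the Object parameter.
    static MethodHandle makeDoublerMH(Class<?> type) {
        try {
            MethodHandle generic = MethodHandles.lookup().findStatic(
                    DoublerSketch.class, "doubled",
                    MethodType.methodType(String.class, Object.class));
            return generic.asType(MethodType.methodType(String.class, type));
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    // Signature-polymorphic call with a String argument.
    static String demoString(String s) {
        try {
            return (String) makeDoublerMH(String.class).invoke(s);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    // Signature-polymorphic call with an int argument.
    static String demoInt(int n) {
        try {
            return (String) makeDoublerMH(int.class).invoke(n);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

With this sketch, demoString("a") yields "[a, a]" and demoInt(7) yields "[7, 7]". Unlike the reweaving scenario, the handle is rebuilt on every call here.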

class IndyTrickUser {
   String foo1(String s) throws Throwable {
      return (String) doubler(String.class).invoke(s);
   }
   String foo2(int n) {
 return 

Virtual Machine Meetup 2016 Program

2016-07-21 Thread Thomas Wuerthinger
The program of the 3rd Virtual Machine Meetup 2016 in Lugano (Switzerland) is
now available online at http://vmmeetup.github.io/2016/. Topics include the
Shenandoah GC, the new Scala compiler Dotty, the Graal JIT compiler, execution
of LLVM-based languages on the JVM, hardware acceleration techniques, and more.

The event is on the 1st and 2nd of September. Registration is free of charge
and open until the 27th of July.

Regards, thomas

mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev