I've been working on similar issues while trying to optimise
something heavily. I made a similar class to this one (I even had a
similar API), but I called it MostlyFinal instead.
private static final MostlyConstant<Integer> FOO =
    new MostlyConstant<>(42, int.class);
private static final IntSupplier FOO_GETTER = FOO.intGetter();
By the way, using a different interface than Supplier can give the JVM
more class hierarchy analysis info and so potentially allow for
inlining even without static final.
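For anyone following along, here is a minimal sketch of how a class in this style could be put together. It is not the actual MostlyConstant or MostlyFinal code; the name MostlyFinalInt and its methods are made up for illustration. The assumption is that the value lives in a MutableCallSite whose target is a constant MethodHandle, so the JIT can fold reads and gets deoptimised when the target changes. A custom single-implementor interface could be swapped in for IntSupplier.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MutableCallSite;
import java.util.function.IntSupplier;

// Hypothetical sketch, not the API of any real library.
final class MostlyFinalInt {
    private final MutableCallSite site;
    private final MethodHandle invoker;

    MostlyFinalInt(int initial) {
        // The call site's target is a handle of type ()int returning a constant.
        site = new MutableCallSite(MethodHandles.constant(int.class, initial));
        invoker = site.dynamicInvoker();
    }

    // Meant to be stored in a static final field by the caller.
    IntSupplier intGetter() {
        MethodHandle mh = invoker; // captured by the lambda
        return () -> {
            try {
                return (int) mh.invokeExact();
            } catch (Throwable t) {
                throw new AssertionError(t);
            }
        };
    }

    // Rare slow path: swap the constant and make the change visible to all threads.
    void set(int value) {
        site.setTarget(MethodHandles.constant(int.class, value));
        MutableCallSite.syncAll(new MutableCallSite[] { site });
    }
}
```

Whether reads actually fold to a constant depends on the JIT seeing the getter through a constant reference, which is why it is normally kept in a static final field as in the FOO_GETTER line above.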
You can also simply use a closure in some cases sort of like this:
interface IntBox {
    int get();
}
public IntBox makeBox(int x) {
    return () -> x;
}
This is better for inlining because the JVM trusts final fields in VM
anonymous classes more than final fields in your own classes.
Unfortunately TrustFinalNonStaticFields cannot be enabled by default
yet for backwards compatibility reasons.
I think a lot of these things are pretty neat but unfortunately hard
to package in a generic and usable library because people delving into
these will want to tear into all the internal details for maximum
performance.
I don't really understand your StableField class. How is it supposed
to be any faster than MostlyConstant? I would suggest that if you
wanted the best speed (in some ways, and at a cost of memory) you
could spin a static final class with a method that returns a
ConstantDynamic entry and then return a MethodHandle to that entry.
This seems possibly heavyweight IMO so I'm still thinking about this
myself.
StringSwitchCallSite being a MutableCallSite seems possibly unneeded
to me with a reworked API.
I am highly skeptical that TypeSwitch will increase performance in
most cases. instanceof checks are highly optimized and give the JIT
information that allows further optimizations.
You might want to consider using/abusing the JVM's own inline caching
behaviour for interfaces for some dispatching.
It's not too hard to create a bag of interface implementations at
runtime that all dispatch to separate CallSite implementations which
can be faster than exactInvoker/your own MethodHandle lookup logic
sometimes. I considered this for a ThreadLocalCallSite class I was
making but I'm still not sure about the design.
So basically one hack to get quicker thread local behaviour is to
subclass Thread and add your own fields/methods like this:
((MyIface)Thread.currentThread()).doSpecific();
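To make the static version of that hack concrete, here is a rough sketch. All of the names (RequestContext, ContextThread, ThreadFieldDemo) are made up for illustration, and it only works on threads you create yourself; the cast will throw ClassCastException on any other thread.

```java
// Hypothetical interface exposing the per-thread state.
interface RequestContext {
    int nextId();
}

// Thread subclass carrying the state as a plain field: no ThreadLocal map lookup.
final class ContextThread extends Thread implements RequestContext {
    private int counter; // only touched by this thread itself

    ContextThread(Runnable task) {
        super(task);
    }

    @Override
    public int nextId() {
        return ++counter;
    }
}

final class ThreadFieldDemo {
    // The "thread local" access is just a cast plus an interface call.
    static int nextIdForCurrentThread() {
        return ((RequestContext) Thread.currentThread()).nextId();
    }

    // Runs two lookups on a ContextThread and returns the ids it observed.
    static int[] runOnContextThread() {
        int[] ids = new int[2];
        Thread t = new ContextThread(() -> {
            ids[0] = nextIdForCurrentThread();
            ids[1] = nextIdForCurrentThread();
        });
        t.start();
        try {
            t.join(); // join gives a happens-before edge, so ids is safely visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ids;
    }
}
```

The obvious downside is that you need control over thread creation (for example through a custom ThreadFactory), which is part of why I'm unsure how to package this.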
If you add your own bag of interface implementations then you can do
this dynamically:
MY_IFACE.invokeExact(((BagThread)Thread.currentThread()).bag, ...);
I'm not sure about the bytecode generation here though. I don't want
to be too blasé about that.
It looks like you have some benchmarks set up but I don't see any txt
files with any perf data listed. I mentioned a lot of gibberish
earlier but probably my biggest advice would be to add more benchmarks,
look at your benchmarks again, and also get real world usage data.