Re: EXT: Re: series of switchpoints or better
On 05.10.2016 21:45, Charles Oliver Nutter wrote:
> On Wed, Oct 5, 2016 at 1:36 PM, Jochen Theodorou wrote:
>> If I hear Remi saying volatile read... then it does not sound free to me
>> actually. In my experience volatile reads still present inlining barriers.
>> But if Remi and all of you tell me it is still basically free, then I will
>> not look too much at the volatile ;)
>
> The volatile read is only used in the interpreter.

ah... I see.. nice. I get the feeling Remi actually already said this...

>> In Groovy we use SwitchPoint as well, but only one for the whole meta class
>> system; that could clearly be improved, it seems. Having a SwitchPoint per
>> method is actually a very interesting approach I would not have considered
>> before, since it means creating a ton of SwitchPoint objects. Not sure if
>> that works in practice for me since it is difficult to make a switchpoint
>> for a method that does not exist in the super class, but may come into
>> existence later on - still it seems I should be considering this.
>
> I suspect Groovy developers are also less likely to modify classes at
> runtime? In Ruby, it's not uncommon to keep creating new classes or
> modifying existing ones at runtime, though it is generally discouraged
> (all runtimes suffer).

It depends a bit on the style whether it is done more or less often. But I think the majority barely changes the classes, and compared to Ruby probably a lot less. We have a construct (Categories) that dynamically adds methods to multiple classes with a limited thread visibility and lifetime, but those are actually not realized as meta class changes. Creating a new class can happen any time, but classes tend not to be built up incrementally; they are usually declared with all the methods you want in there already.

>> cold performance is a consideration for me as well though. The heavy
>> creation time of MethodHandles is one of the reasons we do not use
>> invokedynamic as much as we could... especially considering that creating a
>> new cache entry via runtime class generation and still invoking the method
>> via reflection is actually faster than producing one of our complex method
>> handles right now.
>
> Creating a new cache entry via class generation? Can you elaborate on that?
> JRuby has a non-indy mode, but it doesn't do any code generation per call
> site.

well, the code generation is optional, otherwise we use reflection in that mode. We have used the technique since, I think, 2008. Basically you have an interface call(Object[]), which we produce an implementation for at runtime and then call it. We use MagicAccessorImpl to avoid bytecode validation... well... if existing/accessible, not sure that is still the case in jdk9 though

[...]

> Ahh, so when you invalidate, you only invalidate one class, but every call
> site would have a SwitchPoint for the target class and all of its
> superclasses. That will be more problematic for cold performance than
> JRuby's way, but less overhead when invalidating. I'm not sure which
> trade-off is better.

have to test it out in the future.

> We also use this invalidation mechanism when calling dynamic methods from
> Java (since we also use call site caches there) but those sites are not
> (yet) guarded by a SwitchPoint.

yes, we have a very few cases like this as well.

[...]

> With recent improvements to MH boot time and cold performance, I've started
> to use indy by default in more places, carefully measuring startup overhead
> along the way. I'm well on my way toward having fully invokedynamic-aware
> jitted code basically be all invokedynamics.

invokedynamic by default is the way to go ;)

>> It is also good to hear that the old "once invalidated, it will not be
>> optimized again - ever" is no longer valid.
>
> And hopefully it will stay that way as long as we keep making noise :-)

indeed ;)

bye Jochen

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: EXT: Re: series of switchpoints or better
On Oct 5, 2016, at 12:45 PM, Charles Oliver Nutter wrote:
>
> It is also good to hear that the old "once invalidated, it will not be
> optimized again - ever" is no longer valid.
>
> And hopefully it will stay that way as long as we keep making noise :-)

Go ahead, be that way!
Re: EXT: Re: series of switchpoints or better
On Wed, Oct 5, 2016 at 1:36 PM, Jochen Theodorou wrote:

> If I hear Remi saying volatile read... then it does not sound free to me
> actually. In my experience volatile reads still present inlining barriers.
> But if Remi and all of you tell me it is still basically free, then I will
> not look too much at the volatile ;)

The volatile read is only used in the interpreter.

> In Groovy we use SwitchPoint as well, but only one for the whole meta class
> system; that could clearly be improved, it seems. Having a SwitchPoint per
> method is actually a very interesting approach I would not have considered
> before, since it means creating a ton of SwitchPoint objects. Not sure if
> that works in practice for me since it is difficult to make a switchpoint
> for a method that does not exist in the super class, but may come into
> existence later on - still it seems I should be considering this.

I suspect Groovy developers are also less likely to modify classes at runtime? In Ruby, it's not uncommon to keep creating new classes or modifying existing ones at runtime, though it is generally discouraged (all runtimes suffer).

> cold performance is a consideration for me as well though. The heavy
> creation time of MethodHandles is one of the reasons we do not use
> invokedynamic as much as we could... especially considering that creating a
> new cache entry via runtime class generation and still invoking the method
> via reflection is actually faster than producing one of our complex method
> handles right now.

Creating a new cache entry via class generation? Can you elaborate on that? JRuby has a non-indy mode, but it doesn't do any code generation per call site.

> As for Charles' question:
>
>> Can you elaborate on the structure? JRuby has 6-deep (configurable)
>> polymorphic caching, with each entry being a GWT (to check type) and a SP
>> (to check modification) before hitting the plumbing for the method itself.
>
> right now we use a 1-deep cache with several GWT (check type and argument
> types) and one SP plus several transformations. My goal is of course also
> the 6-deep polymorphic caching in the end. Just motivation for this was not
> so high before. If I use several SwitchPoints, then of course each of them
> would be there for each cache entry. How many depends on the receiver type.
> But at least one for each super class (and interface)

Ahh, so when you invalidate, you only invalidate one class, but every call site would have a SwitchPoint for the target class and all of its superclasses. That will be more problematic for cold performance than JRuby's way, but less overhead when invalidating. I'm not sure which trade-off is better.

We also use this invalidation mechanism when calling dynamic methods from Java (since we also use call site caches there) but those sites are not (yet) guarded by a SwitchPoint.

> To my horror I just found one piece of code commented with:
> //TODO: remove this method if possible by switchpoint usage

With recent improvements to MH boot time and cold performance, I've started to use indy by default in more places, carefully measuring startup overhead along the way. I'm well on my way toward having fully invokedynamic-aware jitted code basically be all invokedynamics.

> It is also good to hear that the old "once invalidated, it will not be
> optimized again - ever" is no longer valid.

And hopefully it will stay that way as long as we keep making noise :-)

- Charlie
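The cache-entry shape discussed in this message (a GWT that checks the receiver type, wrapped in a SwitchPoint that checks for modification) can be sketched roughly as follows. This is only an illustration built on the standard java.lang.invoke API, not JRuby's or Groovy's actual code; the class GuardedCacheEntry and the methods cachedTarget, slowPath and isString are hypothetical names made up for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

public class GuardedCacheEntry {
    // Hypothetical cached target, valid for String receivers only.
    static String cachedTarget(Object receiver) { return "fast:" + receiver; }

    // Hypothetical slow path; a real runtime would look up and relink here.
    static String slowPath(Object receiver) { return "slow:" + receiver; }

    // Type check used by the guardWithTest (GWT).
    static boolean isString(Object receiver) { return receiver instanceof String; }

    // Builds SwitchPoint -> GWT -> target; the slow path is the fallback of both guards.
    static MethodHandle buildEntry(SwitchPoint sp) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType callType = MethodType.methodType(String.class, Object.class);
            MethodHandle target = lookup.findStatic(GuardedCacheEntry.class, "cachedTarget", callType);
            MethodHandle fallback = lookup.findStatic(GuardedCacheEntry.class, "slowPath", callType);
            MethodHandle test = lookup.findStatic(GuardedCacheEntry.class, "isString",
                    MethodType.methodType(boolean.class, Object.class));
            MethodHandle typeGuarded = MethodHandles.guardWithTest(test, target, fallback);
            return sp.guardWithTest(typeGuarded, fallback);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes the entry and wraps any Throwable so callers need no throws clause.
    static String call(MethodHandle entry, Object receiver) {
        try {
            return (String) entry.invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        SwitchPoint sp = new SwitchPoint();
        MethodHandle entry = buildEntry(sp);
        System.out.println(call(entry, "x"));   // type matches, SwitchPoint still valid
        System.out.println(call(entry, 42));    // type guard fails, slow path
        SwitchPoint.invalidateAll(new SwitchPoint[] { sp });
        System.out.println(call(entry, "x"));   // SwitchPoint invalidated, slow path
    }
}
```

A polymorphic cache of the kind mentioned above would chain several such entries, each fallback being the next entry, with the real relinking logic only at the end of the chain.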
Re: EXT: Re: series of switchpoints or better
On 05.10.2016 18:21, MacGregor, Duncan (GE Energy Connections) wrote:
> I second (third?) Charlie and Remi’s comments. SwitchPoint per method has
> worked very nicely to reduce the amount of code invalidated by
> meta-programming shenanigans. You could go further and try for a
> class-and-method switch point, but that makes it harder to eliminate class
> checks or use CHA.
>
> The downside of all this kind of thing is that when stuff is invalidated
> it’s often fairly heavy weight, so it’s worth putting some thought into
> designing things to minimise the amount of code which will be invalidated
> when you flip a SwitchPoint and only invalidating things that really need
> it (that’s where a switch point per method often pays off).

well you know... even if people tell you it is basically for free, it usually is not, which is why I wanted a confirmation. If I hear Remi saying volatile read... then it does not sound free to me actually. In my experience volatile reads still present inlining barriers. But if Remi and all of you tell me it is still basically free, then I will not look too much at the volatile ;)

In Groovy we use SwitchPoint as well, but only one for the whole meta class system; that could clearly be improved, it seems. Having a SwitchPoint per method is actually a very interesting approach I would not have considered before, since it means creating a ton of SwitchPoint objects. Not sure if that works in practice for me since it is difficult to make a switchpoint for a method that does not exist in the super class, but may come into existence later on - still it seems I should be considering this.

cold performance is a consideration for me as well though. The heavy creation time of MethodHandles is one of the reasons we do not use invokedynamic as much as we could... especially considering that creating a new cache entry via runtime class generation and still invoking the method via reflection is actually faster than producing one of our complex method handles right now.

As for Charles' question:

> Can you elaborate on the structure? JRuby has 6-deep (configurable)
> polymorphic caching, with each entry being a GWT (to check type) and a SP
> (to check modification) before hitting the plumbing for the method itself.

right now we use a 1-deep cache with several GWT (check type and argument types) and one SP plus several transformations. My goal is of course also the 6-deep polymorphic caching in the end. Just motivation for this was not so high before. If I use several SwitchPoints, then of course each of them would be there for each cache entry. How many depends on the receiver type. But at least one for each super class (and interface)

To my horror I just found one piece of code commented with:
//TODO: remove this method if possible by switchpoint usage
which means we are currently using switchpoint as well as pinging?! Commit incoming ;)

It is also good to hear that the old "once invalidated, it will not be optimized again - ever" is no longer valid.

thx a lot guys
Jochen
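The "at least one SwitchPoint for each super class" idea from this message could look roughly like the sketch below. It is an assumption-laden illustration, not Groovy's implementation: the registry class HierarchySwitchPoints, its methods, and the example classes A and B are hypothetical, interfaces are ignored for brevity, and the replace-then-invalidate step is not made atomic.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.SwitchPoint;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HierarchySwitchPoints {
    // Hypothetical receiver classes; both extend Object directly.
    static class A {}
    static class B {}

    // One SwitchPoint per class (standing in for "per meta class").
    private final Map<Class<?>, SwitchPoint> points = new ConcurrentHashMap<>();

    SwitchPoint pointFor(Class<?> type) {
        return points.computeIfAbsent(type, k -> new SwitchPoint());
    }

    // Wraps target in the SwitchPoints of receiverClass and all of its superclasses.
    MethodHandle guard(Class<?> receiverClass, MethodHandle target, MethodHandle fallback) {
        MethodHandle guarded = target;
        for (Class<?> c = receiverClass; c != null; c = c.getSuperclass()) {
            guarded = pointFor(c).guardWithTest(guarded, fallback);
        }
        return guarded;
    }

    // Installs a fresh SwitchPoint for the class, then invalidates the old one.
    void metaClassChanged(Class<?> type) {
        SwitchPoint old = points.put(type, new SwitchPoint());
        if (old != null) {
            SwitchPoint.invalidateAll(new SwitchPoint[] { old });
        }
    }

    static String call(MethodHandle mh) {
        try {
            return (String) mh.invoke();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // Changes A first (only A's site falls back), then Object (B's site falls back too).
    static String demo() {
        HierarchySwitchPoints registry = new HierarchySwitchPoints();
        MethodHandle cached = MethodHandles.constant(String.class, "cached");
        MethodHandle relink = MethodHandles.constant(String.class, "relink");
        MethodHandle siteOnA = registry.guard(A.class, cached, relink);
        MethodHandle siteOnB = registry.guard(B.class, cached, relink);
        StringBuilder out = new StringBuilder(call(siteOnA));
        registry.metaClassChanged(A.class);
        out.append(',').append(call(siteOnA)).append(',').append(call(siteOnB));
        registry.metaClassChanged(Object.class);
        out.append(',').append(call(siteOnB));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The handle chain grows with the depth of the hierarchy, which is the cold-performance cost raised in the thread; in exchange, a change to one class leaves call sites on unrelated classes untouched.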
Re: EXT: Re: series of switchpoints or better
I second (third?) Charlie and Remi’s comments. SwitchPoint per method has worked very nicely to reduce the amount of code invalidated by meta-programming shenanigans. You could go further and try for a class-and-method switch point, but that makes it harder to eliminate class checks or use CHA.

The downside of all this kind of thing is that when stuff is invalidated it’s often fairly heavy weight, so it’s worth putting some thought into designing things to minimise the amount of code which will be invalidated when you flip a SwitchPoint and only invalidating things that really need it (that’s where a switch point per method often pays off).

Duncan.

From: mlvm-dev <mlvm-dev-boun...@openjdk.java.net> on behalf of Charles Oliver Nutter <head...@headius.com>
Reply-To: Da Vinci Machine Project <mlvm-dev@openjdk.java.net>
Date: Wednesday, 5 October 2016 at 15:00
To: Da Vinci Machine Project <mlvm-dev@openjdk.java.net>
Subject: EXT: Re: series of switchpoints or better

Hi Jochen!

On Wed, Oct 5, 2016 at 7:37 AM, Jochen Theodorou <blackd...@gmx.org> wrote:

> If the meta class for A is changed, all handles operating on instances of A
> may have to reselect. The handles for B and Object need not be affected.
> If the meta class for Object changes, I need to invalidate all the handles
> for A, B and Object.

This is exactly how JRuby's type-modification guards work. We've used this technique since our first implementation of indy call sites.

> Doing this with switchpoints means probably one switchpoint per metaclass
> and a small number of meta classes per class (in total 3 in my example).
> This would mean my MethodHandle would have to get through a bunch of
> switchpoints, before it can do the actual method invocation. And while
> switchpoints might be fast it does not sound good to me.

From what I've seen, it's fine as far as hot performance. Adding complexity to your handle chains likely impacts cold perf, of course.

Can you elaborate on the structure? JRuby has 6-deep (configurable) polymorphic caching, with each entry being a GWT (to check type) and a SP (to check modification) before hitting the plumbing for the method itself.

I will say that using SwitchPoints is FAR better than our alternative mechanism: pinging the (meta)class each time and checking a serial number.

> Or I can do one switchpoint for all methodhandles in the system, which
> makes me wonder if after a meta class change the callsite ever gets Jitted
> again. The latter performance penalty is actually also not very attractive
> to me.

We have fought to keep the JIT from giving up on us, and I believe that as of today you can invalidate call sites forever and the JIT will still recompile them (within memory, code cache, and other limits of course). However, you'll be invalidating every call site for every modification. If the system eventually settles, that's fine. If it doesn't, you're going to be stuck with cold call site performance most of the time.

> So what is the way to go here? Or is there an even better way?

I strongly recommend the switchpoint-per-class granularity (or finer, like switchpoint-per-class-and-method-name, which I am playing with now).

- Charlie
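For contrast, the "pinging" alternative mentioned in the quoted message (checking a serial number on the (meta)class at every call) might be sketched like this. The names SerialPingCache, MetaClass and CallSiteCache are hypothetical, the cached "target" is just a string, and nothing here reflects JRuby's real non-indy code.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SerialPingCache {
    // Hypothetical metaclass stand-in that bumps a serial number on every change.
    static final class MetaClass {
        private final AtomicInteger serial = new AtomicInteger();
        int serial() { return serial.get(); }
        void modified() { serial.incrementAndGet(); }
    }

    // Hypothetical single-entry call site cache validated by serial number.
    static final class CallSiteCache {
        private final MetaClass metaClass;
        private int cachedSerial = -1;
        private String cachedTarget;
        int lookups; // counts slow-path lookups, for illustration only

        CallSiteCache(MetaClass metaClass) { this.metaClass = metaClass; }

        String resolve(String methodName) {
            int current = metaClass.serial(); // the "ping": a volatile read on every call
            if (cachedTarget == null || cachedSerial != current) {
                lookups++; // stands in for a full method lookup
                cachedTarget = methodName + "@v" + current;
                cachedSerial = current;
            }
            return cachedTarget;
        }
    }

    public static void main(String[] args) {
        MetaClass mc = new MetaClass();
        CallSiteCache site = new CallSiteCache(mc);
        site.resolve("foo");
        site.resolve("foo");          // served from the cache
        mc.modified();
        System.out.println(site.resolve("foo") + " lookups=" + site.lookups);
    }
}
```

Here every call pays for the serial check even when nothing has changed, whereas a SwitchPoint guard moves that cost to invalidation time.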