Re: Getting back into indy...need a better argument collector!
First attempt at a workaround seems to be a wash. I rolled back to my older logic (which does not use a hand-crafted collector method) to come up with a pure-MethodHandle replacement for asCollector. I came up with this (using InvokeBinder):

```
MethodHandle constructArray = Binder.from(arrayType, Object[].class)
    .fold(MethodHandles.arrayLength(Object[].class))
    .dropLast()
    .newArray();
MethodHandle transmuteArray = Binder.from(arrayType, Object[].class)
    .fold(constructArray)
    .appendInts(0, 0, count)
    .permute(1, 2, 0, 3, 4)
    .cast(ARRAYCOPY.type().changeReturnType(arrayType))
    .fold(ARRAYCOPY)
    .permute(2)
    .cast(arrayType, arrayType)
    .identity();
MethodHandle collector = transmuteArray
    .asCollector(Object[].class, count)
    .asType(source.dropParameterTypes(0, index).changeReturnType(arrayType));
return MethodHandles.collectArguments(target, index, collector);
```

Hopefully this is mostly readable. Basically I craft a chain of handles that uses the normal Object[] collector and then simulates what the pre-Jorn asCollector does: allocate the actual array we want and arraycopy everything over. I figured this would be worth a try, since Jorn's comments on the PR hinted at the intermediate Object[] going away for some collect forms.

Unfortunately, reproducing the old asCollector using MethodHandles does not appear to work any better... or at least it still pales in comparison to a collector function. I am open to suggestions, because my next attempt will probably be to chain a series of folds together that populate the target array directly, and that chain will be array.length deep. Not ideal, and not a good general solution.

On Thu, Apr 1, 2021 at 6:44 PM Charles Oliver Nutter wrote:
> > Very nice! I will have a look at the pull request and perhaps it will lead me
> > to a short-term workaround as well.
> > On Thu, Apr 1, 2021, 12:04 Jorn Vernee wrote: >> >> Hi Charlie, >> >> (Sorry for replying out of line like this, but I'm not currently >> subscribed to the mlvm-dev mailing list, so I could not reply to your >> earlier email thread directly.) >> >> I have fixed the performance issue with asCollector you reported [1], >> and with the patch the performance should be the same/similar for any >> array type (as well as fixing a related issue with collectors that take >> more than 10 arguments). The patch is out for review here: >> https://github.com/openjdk/jdk/pull/3306 >> >> Cheers, >> Jorn >> >> [1] : https://bugs.openjdk.java.net/browse/JDK-8264288 >> ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net https://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
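[Editor's note] For readers skimming the archive, the tradeoff debated throughout this thread — MethodHandle.asCollector versus MethodHandles.collectArguments with a hand-written collector — can be sketched in plain Java. This is an illustrative reconstruction, not JRuby code; String[] stands in here for JRuby's IRubyObject[] subtype:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class CollectorDemo {
    // Hand-written collector: one allocation, one store per argument.
    static String[] collect2(String a, String b) {
        return new String[] { a, b };
    }

    // Stand-in for a target method that only accepts the boxed array form.
    static int arity(String[] args) { return args.length; }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle target = lookup.findStatic(CollectorDemo.class, "arity",
                MethodType.methodType(int.class, String[].class));

        // Variant 1: built-in asCollector, the slow path reported in this
        // thread for non-Object[] array types (before JDK-8264288).
        MethodHandle viaAsCollector = target.asCollector(String[].class, 2);

        // Variant 2: collectArguments with the hand-written collector method.
        MethodHandle collector = lookup.findStatic(CollectorDemo.class, "collect2",
                MethodType.methodType(String[].class, String.class, String.class));
        MethodHandle viaCollectArguments = MethodHandles.collectArguments(target, 0, collector);

        // Both end up with type (String, String) -> int and agree on results.
        if ((int) viaAsCollector.invokeExact("a", "b") != 2) throw new AssertionError();
        if ((int) viaCollectArguments.invokeExact("a", "b") != 2) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The thread's report is that the second shape avoided the intermediate Object[] gather-and-copy that asCollector performed for non-Object[] array types.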
Re: Getting back into indy...need a better argument collector!
Very nice! I will have a look at the pull request and perhaps it will lead me to a short-term work around as well. On Thu, Apr 1, 2021, 12:04 Jorn Vernee wrote: > Hi Charlie, > > (Sorry for replying out of line like this, but I'm not currently > subscribed to the mlvm-dev mailing list, so I could not reply to your > earlier email thread directly.) > > I have fixed the performance issue with asCollector you reported [1], > and with the patch the performance should be the same/similar for any > array type (as well as fixing a related issue with collectors that take > more than 10 arguments). The patch is out for review here: > https://github.com/openjdk/jdk/pull/3306 > > Cheers, > Jorn > > [1] : https://bugs.openjdk.java.net/browse/JDK-8264288 > > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net https://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Getting back into indy...need a better argument collector!
Thanks Paul! I am moving forward with my JRuby PRs but if I can help in any way let me know. I am especially interested in whether there might be some workaround rather than having to write my own custom argument boxing collectors. Will try to poke around at other combinations of handles and see what I can come up with. - Charlie On Fri, Mar 26, 2021 at 11:20 AM Paul Sandoz wrote: > > Hi Charlie, > > Thanks for the details. I quickly logged: > > https://bugs.openjdk.java.net/browse/JDK-8264288 > > I don’t have time to dive into the details right now. Perhaps next week, or > hopefully someone else can. > > Paul. > > > On Mar 25, 2021, at 9:25 PM, Charles Oliver Nutter > > wrote: > > > > JRuby branch with changes to use our own collector methods: > > https://github.com/jruby/jruby/pull/6630 > > > > InvokeBinder 1.2 added collect(index, type, collector) that calls > > MethodHandles.collectArguments: > > https://github.com/headius/invokebinder/commit/9650de07715c6e15a8ca4029c40ea5ede9d5c4c9 > > > > A build of JRuby from the branch (or from jruby-9.2 branch or master > > once it is merged) compared with JRuby 9.2.16.0 should show the issue. > > Benchmark included in the PR above. > > > > On Thu, Mar 25, 2021 at 8:43 PM Charles Oliver Nutter > > wrote: > >> > >> After experimenting with MethodHandles.collectArguments (given a > >> hand-written collector function) versus my own logic (using folds and > >> permutes to call my collector), I can confirm that both are roughly > >> equivalent and better than MethodHandle.asCollector. > >> > >> The benchmark linked below calls a lightweight core Ruby method > >> (Array#dig) that only accepts an IRubyObject[] (so all arities must > >> box). The performance of collectArguments is substantially better than > >> asCollector. > >> > >> https://gist.github.com/headius/28343b8c393e76c717314af57089848d > >> > >> I do not believe this should be so. 
The logic for asCollector should > >> be able to gather up Object subtypes into an Object[] subtype without > >> an intermediate array or extra copying. > >> > >> On Thu, Mar 25, 2021 at 7:39 PM Charles Oliver Nutter > >> wrote: > >>> > >>> Well it only took me five years to circle back to this but I can > >>> confirm it is just as bad now as it ever was. And it is definitely due > >>> to collecting a single type. > >>> > >>> I will provide whatever folks need to investigate but it is pretty > >>> straightforward. When asking for asCollector of a non-Object[] type, > >>> the implementation will first gather arguments into an Object[], and > >>> then create a copy of that array as the correct type. So two arrays > >>> are created, values are copied twice. > >>> > >>> I can see this quite clearly in the assembly after letting things > >>> optimize. A new Object[] is created and populated, and then a second > >>> array of the correct type is created followed by an arraycopy > >>> operation. > >>> > >>> I am once again backing off using asCollector directly to instead > >>> provide my own array-construction collector. > >>> > >>> Should be easy to reproduce the perf issues simply by doing an > >>> asCollector that results in some subtype of Object[]. > >>> > >>> On Thu, Jan 14, 2016 at 8:18 PM Charles Oliver Nutter > >>> wrote: > >>>> > >>>> Thanks Duncan. I will try to look under the covers this evening. > >>>> > >>>> - Charlie (mobile) > >>>> > >>>> On Jan 14, 2016 14:39, "MacGregor, Duncan (GE Energy Management)" > >>>> wrote: > >>>>> > >>>>> On 11/01/2016, 11:27, "mlvm-dev on behalf of MacGregor, Duncan (GE > >>>>> Energy > >>>>> Management)" >>>>> duncan.macgre...@ge.com> wrote: > >>>>> > >>>>>> On 11/01/2016, 03:16, "mlvm-dev on behalf of Charles Oliver Nutter" > >>>>>> > >>>>>> wrote: > >>>>>> ... 
> >>>>>>> With asCollector: 16-17s per iteration > >>>>>>> > >>>>>>> With hand-written array construction: 7-8s per iteration > >>>>>>> > >>>>>>> A sampling profile only shows my Ruby code as the
Re: Getting back into indy...need a better argument collector!
JRuby branch with changes to use our own collector methods: https://github.com/jruby/jruby/pull/6630 InvokeBinder 1.2 added collect(index, type, collector) that calls MethodHandles.collectArguments: https://github.com/headius/invokebinder/commit/9650de07715c6e15a8ca4029c40ea5ede9d5c4c9 A build of JRuby from the branch (or from jruby-9.2 branch or master once it is merged) compared with JRuby 9.2.16.0 should show the issue. Benchmark included in the PR above. On Thu, Mar 25, 2021 at 8:43 PM Charles Oliver Nutter wrote: > > After experimenting with MethodHandles.collectArguments (given a > hand-written collector function) versus my own logic (using folds and > permutes to call my collector), I can confirm that both are roughly > equivalent and better than MethodHandle.asCollector. > > The benchmark linked below calls a lightweight core Ruby method > (Array#dig) that only accepts an IRubyObject[] (so all arities must > box). The performance of collectArguments is substantially better than > asCollector. > > https://gist.github.com/headius/28343b8c393e76c717314af57089848d > > I do not believe this should be so. The logic for asCollector should > be able to gather up Object subtypes into an Object[] subtype without > an intermediate array or extra copying. > > On Thu, Mar 25, 2021 at 7:39 PM Charles Oliver Nutter > wrote: > > > > Well it only took me five years to circle back to this but I can > > confirm it is just as bad now as it ever was. And it is definitely due > > to collecting a single type. > > > > I will provide whatever folks need to investigate but it is pretty > > straightforward. When asking for asCollector of a non-Object[] type, > > the implementation will first gather arguments into an Object[], and > > then create a copy of that array as the correct type. So two arrays > > are created, values are copied twice. > > > > I can see this quite clearly in the assembly after letting things > > optimize. 
A new Object[] is created and populated, and then a second > > array of the correct type is created followed by an arraycopy > > operation. > > > > I am once again backing off using asCollector directly to instead > > provide my own array-construction collector. > > > > Should be easy to reproduce the perf issues simply by doing an > > asCollector that results in some subtype of Object[]. > > > > On Thu, Jan 14, 2016 at 8:18 PM Charles Oliver Nutter > > wrote: > > > > > > Thanks Duncan. I will try to look under the covers this evening. > > > > > > - Charlie (mobile) > > > > > > On Jan 14, 2016 14:39, "MacGregor, Duncan (GE Energy Management)" > > > wrote: > > >> > > >> On 11/01/2016, 11:27, "mlvm-dev on behalf of MacGregor, Duncan (GE Energy > > >> Management)" > >> duncan.macgre...@ge.com> wrote: > > >> > > >> >On 11/01/2016, 03:16, "mlvm-dev on behalf of Charles Oliver Nutter" > > >> > > > >> >wrote: > > >> >... > > >> >>With asCollector: 16-17s per iteration > > >> >> > > >> >>With hand-written array construction: 7-8s per iteration > > >> >> > > >> >>A sampling profile only shows my Ruby code as the top items, and an > > >> >>allocation trace shows Object[] as the number one object being > > >> >>created...not IRubyObject[]. Could that be the reason it's slower? > > >> >>Some type trickery messing with optimization? > > >> >> > > >> >>This is very unfortunate because there's no other general-purpose way > > >> >>to collect arguments in a handle chain. > > >> > > > >> >I haven't done any comparative benchmarks in that area for a while, but > > >> >collecting a single argument is a pretty common pattern in the Magik > > >> >code, > > >> >and I had not seen any substantial difference when we last touched that > > >> >area. However we are collecting to plain Object[] so it might be that is > > >> >the reason for the difference. If I've got time later this week I'll do > > >> >some experimenting and check what the current situation is.
> > >> > > >> Okay, I've now had a chance to try this with our language benchmarks > > >> and can't see any significant difference between a hand-crafted method > > >> and > > >> asCollector, but we are dealing with Object and Object[], so it might be > > >> something to do with additional casting. > > >> > > >> Duncan. > > >> > > >> ___ > > >> mlvm-dev mailing list > > >> mlvm-dev@openjdk.java.net > > >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net https://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Getting back into indy...need a better argument collector!
After experimenting with MethodHandles.collectArguments (given a hand-written collector function) versus my own logic (using folds and permutes to call my collector), I can confirm that both are roughly equivalent and better than MethodHandle.asCollector. The benchmark linked below calls a lightweight core Ruby method (Array#dig) that only accepts an IRubyObject[] (so all arities must box). The performance of collectArguments is substantially better than asCollector. https://gist.github.com/headius/28343b8c393e76c717314af57089848d I do not believe this should be so. The logic for asCollector should be able to gather up Object subtypes into an Object[] subtype without an intermediate array or extra copying. On Thu, Mar 25, 2021 at 7:39 PM Charles Oliver Nutter wrote: > > Well it only took me five years to circle back to this but I can > confirm it is just as bad now as it ever was. And it is definitely due > to collecting a single type. > > I will provide whatever folks need to investigate but it is pretty > straightforward. When asking for asCollector of a non-Object[] type, > the implementation will first gather arguments into an Object[], and > then create a copy of that array as the correct type. So two arrays > are created, values are copied twice. > > I can see this quite clearly in the assembly after letting things > optimize. A new Object[] is created and populated, and then a second > array of the correct type is created followed by an arraycopy > operation. > > I am once again backing off using asCollector directly to instead > provide my own array-construction collector. > > Should be easy to reproduce the perf issues simply by doing an > asCollector that results in some subtype of Object[]. > > On Thu, Jan 14, 2016 at 8:18 PM Charles Oliver Nutter > wrote: > > > > Thanks Duncan. I will try to look under the covers this evening. 
> > > > - Charlie (mobile) > > > > On Jan 14, 2016 14:39, "MacGregor, Duncan (GE Energy Management)" > > wrote: > >> > >> On 11/01/2016, 11:27, "mlvm-dev on behalf of MacGregor, Duncan (GE Energy > >> Management)" >> duncan.macgre...@ge.com> wrote: > >> > >> >On 11/01/2016, 03:16, "mlvm-dev on behalf of Charles Oliver Nutter" > >> > > >> >wrote: > >> >... > >> >>With asCollector: 16-17s per iteration > >> >> > >> >>With hand-written array construction: 7-8s per iteration > >> >> > >> >>A sampling profile only shows my Ruby code as the top items, and an > >> >>allocation trace shows Object[] as the number one object being > >> >>created...not IRubyObject[]. Could that be the reason it's slower? > >> >>Some type trickery messing with optimization? > >> >> > >> >>This is very unfortunate because there's no other general-purpose way > >> >>to collect arguments in a handle chain. > >> > > >> >I haven't done any comparative benchmarks in that area for a while, but > >> >collecting a single argument is a pretty common pattern in the Magik code, > >> >and I had not seen any substantial difference when we last touched that > >> >area. However we are collecting to plain Object[] so it might be that is > >> >the reason for the difference. If I've got time later this week I'll do > >> >some experimenting and check what the current situation is. > >> > >> Okay, I've now had a chance to try this with our language benchmarks > >> and can't see any significant difference between a hand-crafted method and > >> asCollector, but we are dealing with Object and Object[], so it might be > >> something to do with additional casting. > >> > >> Duncan. > >> > >> ___ > >> mlvm-dev mailing list > >> mlvm-dev@openjdk.java.net > >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net https://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Getting back into indy...need a better argument collector!
Well it only took me five years to circle back to this but I can confirm it is just as bad now as it ever was. And it is definitely due to collecting a single type. I will provide whatever folks need to investigate but it is pretty straightforward. When asking for asCollector of a non-Object[] type, the implementation will first gather arguments into an Object[], and then create a copy of that array as the correct type. So two arrays are created, values are copied twice. I can see this quite clearly in the assembly after letting things optimize. A new Object[] is created and populated, and then a second array of the correct type is created followed by an arraycopy operation. I am once again backing off using asCollector directly to instead provide my own array-construction collector. Should be easy to reproduce the perf issues simply by doing an asCollector that results in some subtype of Object[]. On Thu, Jan 14, 2016 at 8:18 PM Charles Oliver Nutter wrote: > > Thanks Duncan. I will try to look under the covers this evening. > > - Charlie (mobile) > > On Jan 14, 2016 14:39, "MacGregor, Duncan (GE Energy Management)" > wrote: >> >> On 11/01/2016, 11:27, "mlvm-dev on behalf of MacGregor, Duncan (GE Energy >> Management)" > duncan.macgre...@ge.com> wrote: >> >> >On 11/01/2016, 03:16, "mlvm-dev on behalf of Charles Oliver Nutter" >> > >> >wrote: >> >... >> >>With asCollector: 16-17s per iteration >> >> >> >>With hand-written array construction: 7-8s per iteration >> >> >> >>A sampling profile only shows my Ruby code as the top items, and an >> >>allocation trace shows Object[] as the number one object being >> >>created...not IRubyObject[]. Could that be the reason it's slower? >> >>Some type trickery messing with optimization? >> >> >> >>This is very unfortunate because there's no other general-purpose way >> >>to collect arguments in a handle chain. 
>> > >> >I haven't done any comparative benchmarks in that area for a while, but >> >collecting a single argument is a pretty common pattern in the Magik code, >> >and I had not seen any substantial difference when we last touched that >> >area. However we are collecting to plain Object[] so it might be that is >> >the reason for the difference. If I've got time later this week I'll do >> >some experimenting and check what the current situation is. >> >> Okay, I've now had a chance to try this with our language benchmarks >> and can't see any significant difference between a hand-crafted method and >> asCollector, but we are dealing with Object and Object[], so it might be >> something to do with additional casting. >> >> Duncan. >> >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net https://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
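[Editor's note] The double-allocation behavior Charlie describes seeing in the assembly can be restated as ordinary Java. This is a sketch of the *shape* of the old asCollector path for non-Object[] types, not the actual java.lang.invoke implementation; String[] stands in for IRubyObject[]:

```java
import java.util.Arrays;

public class AsCollectorShape {
    // Old asCollector shape for asCollector(String[].class, 3): gather into an
    // Object[], then allocate a second, correctly-typed array and copy again.
    static String[] collectOldShape(Object a, Object b, Object c) {
        Object[] gathered = new Object[] { a, b, c };             // first array, first copy
        String[] typed = new String[gathered.length];             // second array
        System.arraycopy(gathered, 0, typed, 0, gathered.length); // second copy
        return typed;
    }

    // What a hand-written collector does instead: one allocation, direct stores.
    static String[] collectDirect(String a, String b, String c) {
        return new String[] { a, b, c };
    }

    public static void main(String[] args) {
        String[] twoCopies = collectOldShape("a", "b", "c");
        String[] oneCopy = collectDirect("a", "b", "c");
        if (!Arrays.equals(twoCopies, oneCopy)) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Both produce the same array contents; the difference the benchmarks measure is the extra allocation and arraycopy in the first form.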
Re: NoClassDefFoundError using LMF against a generated class's handle
To help illustrate a bit, here's a snippet of the code to create the allocator. It succeeds, but the allocator later throws NoClassDefFoundError. https://gist.github.com/headius/cce750221cf73df76cb7f7ce92c1a759 - Charlie On Fri, Jun 29, 2018 at 8:00 PM, Charles Oliver Nutter wrote: > Hello folks! > > I'm improving JRuby's support for instance variables-as-fields, which > involves generating a new JVM class with a field per instance variable in > the Ruby class. > > The construction process for these classes involves an implementation of > my "ObjectAllocator" interface, which is stored with the Ruby class. > > Previously, the generated classes also included a generated child class > that implemented ObjectAllocator appropriately. I was hoping to use > LambdaMetafactory to avoid generating that class, but I'm running into a > problem. > > Say we have a Ruby class with three instance variables. JRuby will > generate a "RubyObject3" class that holds those variables in their own > fields var0, var1, and var2. The process leading up to the bug goes like > this: > > * Generate the RubyObject3 class, in its own classloader that's a child of > the current one. > * Acquire a constructor handle for that class. > * Use that constructor with LambdaMetafactory.metafactory to produce an > allocator-creating call site. > * Invoke that call site to get the one allocator instance we need. > > Note that since the metafactory call requires a Lookup, I am providing it > one from the parent classloader. > > I am able to get through this process without error. However, when I > finally invoke the allocator, I get a NoClassDefFoundError and a stack > trace that ends at the allocator call. > > So...what am I doing wrong? > > - Charlie > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
NoClassDefFoundError using LMF against a generated class's handle
Hello folks! I'm improving JRuby's support for instance variables-as-fields, which involves generating a new JVM class with a field per instance variable in the Ruby class. The construction process for these classes involves an implementation of my "ObjectAllocator" interface, which is stored with the Ruby class. Previously, the generated classes also included a generated child class that implemented ObjectAllocator appropriately. I was hoping to use LambdaMetafactory to avoid generating that class, but I'm running into a problem. Say we have a Ruby class with three instance variables. JRuby will generate a "RubyObject3" class that holds those variables in their own fields var0, var1, and var2. The process leading up to the bug goes like this: * Generate the RubyObject3 class, in its own classloader that's a child of the current one. * Acquire a constructor handle for that class. * Use that constructor with LambdaMetafactory.metafactory to produce an allocator-creating call site. * Invoke that call site to get the one allocator instance we need. Note that since the metafactory call requires a Lookup, I am providing it one from the parent classloader. I am able to get through this process without error. However, when I finally invoke the allocator, I get a NoClassDefFoundError and a stack trace that ends at the allocator call. So...what am I doing wrong? - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
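[Editor's note] The four-step sequence above can be sketched with LambdaMetafactory directly. The ObjectAllocator interface and RubyObject3 class below are hypothetical stand-ins, and everything here lives in one classloader, so both the metafactory call and the allocator succeed; the failure described in the report plausibly comes from the Lookup's (parent) loader being unable to resolve the generated class:

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class LmfAllocatorDemo {
    // Stand-in for JRuby's ObjectAllocator SAM interface.
    public interface ObjectAllocator {
        Object allocate();
    }

    // Stand-in for a generated three-field object class.
    public static class RubyObject3 {
        Object var0, var1, var2;
        public RubyObject3() {}
    }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();

        // Step 2: acquire a constructor handle of type ()RubyObject3.
        MethodHandle ctor = lookup.findConstructor(RubyObject3.class,
                MethodType.methodType(void.class));

        // Step 3: bind the constructor to the ObjectAllocator SAM.
        CallSite site = LambdaMetafactory.metafactory(
                lookup,
                "allocate",                                   // SAM method name
                MethodType.methodType(ObjectAllocator.class), // factory type: () -> ObjectAllocator
                MethodType.methodType(Object.class),          // erased SAM signature
                ctor,                                         // implementation handle
                MethodType.methodType(RubyObject3.class));    // instantiated signature

        // Step 4: invoke the call site once to get the single allocator instance.
        ObjectAllocator allocator = (ObjectAllocator) site.getTarget().invokeExact();
        if (!(allocator.allocate() instanceof RubyObject3)) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The key constraint is that the Lookup passed to metafactory must be able to resolve every type in the instantiated signature, including the generated class.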
Re: ClassValue rooting objects after it goes away?
Put it another way: does a static reference from a class to itself prevent that class from being garbage collected? Of course not. ClassValue is intended to be a way to inject pseudo-static data into either a class or a Class. Injecting that data, even if it has a reference back to the class, should not prevent the class from being collected. On Fri, Mar 2, 2018 at 2:19 PM Charles Oliver Nutter wrote: > I have posted a modified version of my description to the main bug report. > > TLDR: ClassValue should not root objects. > > - Charlie > > On Fri, Mar 2, 2018 at 2:13 PM Charles Oliver Nutter > wrote: > >> Yes, it may be the same bug. >> >> In my case, the ClassValue is held by a utility object used for our Java >> integration. That utility object has to live somewhere, so it's held by the >> JRuby runtime instance. There's a strong reference chain leading to the >> ClassValue. >> >> The value is a Ruby representation of the class, with reflected methods >> parsed out and turned into Ruby endpoints. Obviously, the value also >> references the class, either directly or indirectly through reflected >> members. >> >> The Ruby class wrapper is only hard referenced directly if there's an >> instance of the object live and moving through JRuby. It may be referenced >> indirectly through inline caches. >> >> However...I do not believe this should prevent collection of the class >> associated with the ClassValue. >> >> The value referenced in the ClassValue should not constitute a hard >> reference. If it is alive *only* because of its associate with a given >> class, that should not be enough to root either the object or the class. >> >> ClassValue should work like ThreadLocal. If the Thread associated with a >> value goes away, the value reference goes away. ThreadLocal does nothing >> prevent it from being collected. 
If the Class associated with a Value goes >> away, the same should happen to that Value and it should be collectable >> once all other hard references are gone. >> >> Perhaps I've misunderstood? >> >> - Charlie >> >> On Fri, Mar 2, 2018 at 12:16 PM Vladimir Ivanov < >> vladimir.x.iva...@oracle.com> wrote: >> >>> Charlie, >>> >>> Does it look similar to the following bugs? >>>https://bugs.openjdk.java.net/browse/JDK-8136353 >>>https://bugs.openjdk.java.net/browse/JDK-8169425 >>> >>> If that's the same (and it seems so to me [1]), then speak up and >>> persuade Paul it's an important edge case (as stated in JDK-8169425). >>> >>> Best regards, >>> Vladimir Ivanov >>> >>> [1] new RubyClass(Ruby.this) in >>> >>> public static class Ruby { >>> private ClassValue cache = new >>> ClassValue() { >>> protected RubyClass computeValue(Class type) { >>> return new RubyClass(Ruby.this); >>> } >>> }; >>> >>> On 3/1/18 2:25 AM, Charles Oliver Nutter wrote: >>> > So I don't think we ever closed the loop here. Did anyone on the JDK >>> > side confirm this, file an issue, or fix it? >>> > >>> > We still have ClassValue disabled in JRuby because of the rooting >>> issues >>> > described here and in https://github.com/jruby/jruby/pull/3228. >>> > >>> > - Charlie >>> > >>> > On Thu, Aug 27, 2015 at 7:04 AM Jochen Theodorou >> > <mailto:blackd...@gmx.org>> wrote: >>> > >>> > One more thing... >>> > >>> > Remi, I tried your link with my simplified scenario and it does >>> there >>> > not stop the collection of the classloader >>> > >>> > Am 27.08.2015 11:54, schrieb Jochen Theodorou: >>> > > Hi, >>> > > >>> > > In trying to reproduce the problem outside of Groovy I stumbled >>> > over a >>> > > case case which I think should work >>> > > >>> > > public class MyClassValue extends ClassValue { >>> > > protected Object computeValue(Class type) { >>> > > Dummy ret = new Dummy(); >>> > > Dummy.l.add (this); >>> > > return ret; >>> > > } >>> > > } >>> > > >>> > > class Dummy { >>> >
Re: ClassValue rooting objects after it goes away?
I have posted a modified version of my description to the main bug report. TLDR: ClassValue should not root objects. - Charlie On Fri, Mar 2, 2018 at 2:13 PM Charles Oliver Nutter wrote: > Yes, it may be the same bug. > > In my case, the ClassValue is held by a utility object used for our Java > integration. That utility object has to live somewhere, so it's held by the > JRuby runtime instance. There's a strong reference chain leading to the > ClassValue. > > The value is a Ruby representation of the class, with reflected methods > parsed out and turned into Ruby endpoints. Obviously, the value also > references the class, either directly or indirectly through reflected > members. > > The Ruby class wrapper is only hard referenced directly if there's an > instance of the object live and moving through JRuby. It may be referenced > indirectly through inline caches. > > However...I do not believe this should prevent collection of the class > associated with the ClassValue. > > The value referenced in the ClassValue should not constitute a hard > reference. If it is alive *only* because of its associate with a given > class, that should not be enough to root either the object or the class. > > ClassValue should work like ThreadLocal. If the Thread associated with a > value goes away, the value reference goes away. ThreadLocal does nothing > prevent it from being collected. If the Class associated with a Value goes > away, the same should happen to that Value and it should be collectable > once all other hard references are gone. > > Perhaps I've misunderstood? > > - Charlie > > On Fri, Mar 2, 2018 at 12:16 PM Vladimir Ivanov < > vladimir.x.iva...@oracle.com> wrote: > >> Charlie, >> >> Does it look similar to the following bugs? 
>>https://bugs.openjdk.java.net/browse/JDK-8136353 >>https://bugs.openjdk.java.net/browse/JDK-8169425 >> >> If that's the same (and it seems so to me [1]), then speak up and >> persuade Paul it's an important edge case (as stated in JDK-8169425). >> >> Best regards, >> Vladimir Ivanov >> >> [1] new RubyClass(Ruby.this) in >> >> public static class Ruby { >> private ClassValue cache = new >> ClassValue() { >> protected RubyClass computeValue(Class type) { >> return new RubyClass(Ruby.this); >> } >> }; >> >> On 3/1/18 2:25 AM, Charles Oliver Nutter wrote: >> > So I don't think we ever closed the loop here. Did anyone on the JDK >> > side confirm this, file an issue, or fix it? >> > >> > We still have ClassValue disabled in JRuby because of the rooting issues >> > described here and in https://github.com/jruby/jruby/pull/3228. >> > >> > - Charlie >> > >> > On Thu, Aug 27, 2015 at 7:04 AM Jochen Theodorou > > <mailto:blackd...@gmx.org>> wrote: >> > >> > One more thing... >> > >> > Remi, I tried your link with my simplified scenario and it does >> there >> > not stop the collection of the classloader >> > >> > Am 27.08.2015 11:54, schrieb Jochen Theodorou: >> > > Hi, >> > > >> > > In trying to reproduce the problem outside of Groovy I stumbled >> > over a >> > > case case which I think should work >> > > >> > > public class MyClassValue extends ClassValue { >> > > protected Object computeValue(Class type) { >> > > Dummy ret = new Dummy(); >> > > Dummy.l.add (this); >> > > return ret; >> > > } >> > > } >> > > >> > > class Dummy { >> > > static final ArrayList l = new ArrayList(); >> > > } >> > > >> > > basically this means there will be a hard reference on the >> ClassValue >> > > somewhere. It can be in a static or non-static field, direct or >> > > indirect. But this won't collect. If I put for example a >> > WeakReference >> > > in between it works again. 
>> > > >> > > Finally I also tested to put the hard reference in a third class >> > > instead, to avoid this self reference. But it can still not >> collect. >> > > >> > > So I currently have the impression that if anything holds a hard >> > > reference on th
Re: ClassValue rooting objects after it goes away?
Yes, it may be the same bug. In my case, the ClassValue is held by a utility object used for our Java integration. That utility object has to live somewhere, so it's held by the JRuby runtime instance. There's a strong reference chain leading to the ClassValue. The value is a Ruby representation of the class, with reflected methods parsed out and turned into Ruby endpoints. Obviously, the value also references the class, either directly or indirectly through reflected members. The Ruby class wrapper is only hard referenced directly if there's an instance of the object live and moving through JRuby. It may be referenced indirectly through inline caches. However...I do not believe this should prevent collection of the class associated with the ClassValue. The value referenced in the ClassValue should not constitute a hard reference. If it is alive *only* because of its association with a given class, that should not be enough to root either the object or the class. ClassValue should work like ThreadLocal. If the Thread associated with a value goes away, the value reference goes away. ThreadLocal does nothing to prevent it from being collected. If the Class associated with a Value goes away, the same should happen to that Value and it should be collectable once all other hard references are gone. Perhaps I've misunderstood? - Charlie On Fri, Mar 2, 2018 at 12:16 PM Vladimir Ivanov < vladimir.x.iva...@oracle.com> wrote: > Charlie, > > Does it look similar to the following bugs? >https://bugs.openjdk.java.net/browse/JDK-8136353 >https://bugs.openjdk.java.net/browse/JDK-8169425 > > If that's the same (and it seems so to me [1]), then speak up and > persuade Paul it's an important edge case (as stated in JDK-8169425). 
> > Best regards, > Vladimir Ivanov > > [1] new RubyClass(Ruby.this) in > > public static class Ruby { > private ClassValue cache = new ClassValue() > { > protected RubyClass computeValue(Class type) { > return new RubyClass(Ruby.this); > } > }; > > On 3/1/18 2:25 AM, Charles Oliver Nutter wrote: > > So I don't think we ever closed the loop here. Did anyone on the JDK > > side confirm this, file an issue, or fix it? > > > > We still have ClassValue disabled in JRuby because of the rooting issues > > described here and in https://github.com/jruby/jruby/pull/3228. > > > > - Charlie > > > > On Thu, Aug 27, 2015 at 7:04 AM Jochen Theodorou > <mailto:blackd...@gmx.org>> wrote: > > > > One more thing... > > > > Remi, I tried your link with my simplified scenario and it does there > > not stop the collection of the classloader > > > > Am 27.08.2015 11:54, schrieb Jochen Theodorou: > > > Hi, > > > > > > In trying to reproduce the problem outside of Groovy I stumbled > > over a > > > case case which I think should work > > > > > > public class MyClassValue extends ClassValue { > > > protected Object computeValue(Class type) { > > > Dummy ret = new Dummy(); > > > Dummy.l.add (this); > > > return ret; > > > } > > > } > > > > > > class Dummy { > > > static final ArrayList l = new ArrayList(); > > > } > > > > > > basically this means there will be a hard reference on the > ClassValue > > > somewhere. It can be in a static or non-static field, direct or > > > indirect. But this won't collect. If I put for example a > > WeakReference > > > in between it works again. > > > > > > Finally I also tested to put the hard reference in a third class > > > instead, to avoid this self reference. But it can still not > collect. > > > > > > So I currently have the impression that if anything holds a hard > > > reference on the class value that the classloader cannot be > collected > > > anymore. 
> > > > > > Unless I misunderstand something here I see that as a bug > > > > > > bye blackdrag > > > > > > > > > -- > > Jochen "blackdrag" Theodorou > > blog: http://blackdragsview.blogspot.com/ > > > > ___ > > mlvm-dev mailing list > > mlvm-dev@openjdk.java.net <mailto:mlvm-dev@openjdk.java.net> > > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > > > -- > > > > - Charlie (mobile) > > > > > > > > ___ > > mlvm-dev mailing list > > mlvm-dev@openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > > -- - Charlie (mobile) ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Interface injection in an age of default interface methods
Here's an oldie but goodie: whatever happened to interface injection? For those unfamiliar, we dynlang guys had an idea years ago that if we could simply "force" an interface into an existing Java class, with a handler dangling off the side, we could pass normal Java objects through languages that have their own supertypes without needing a wrapper. So in the case of JRuby, where every method signature and every local variable is typed IRubyObject, we'd inject a default impl of IRubyObject into java.lang.Object, and it would know how to handle all our dispatch logic. Back in the day, one of the sticky bits was wiring together the implementation of all those interface methods. These days, perhaps that's not a problem with default interface methods from Java 8? Perhaps the JVM could (at some point) even allow you to cast an object to *any* interface, so long as all that interface's methods had default or natural implementations? - Charlie -- - Charlie (mobile) ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
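A hypothetical sketch of what an "injectable" interface could look like today: the method names mirror JRuby's IRubyObject but the bodies are illustrative only. With default methods, the dispatch logic can live in the interface itself; what is still missing is the JVM-level step of viewing an arbitrary Object through the interface without the class opting in.

```java
public class InjectionSketch {
    interface IRubyObject {
        // With default methods, the "handler dangling off the side" can live
        // in the interface itself -- no per-class method wiring needed.
        default Object callMethod(String name, Object... args) {
            return name + "/" + args.length + " on " + getClass().getSimpleName();
        }
    }

    // Today a class must opt in explicitly -- this is the step that
    // injection (or a hypothetical cast-to-any-interface) would remove.
    static final class Point implements IRubyObject {}

    public static void main(String[] args) {
        IRubyObject o = new Point();
        System.out.println(o.callMethod("to_s"));  // "to_s/0 on Point"
    }
}
```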
Re: ClassValue rooting objects after it goes away?
So I don't think we ever closed the loop here. Did anyone on the JDK side confirm this, file an issue, or fix it? We still have ClassValue disabled in JRuby because of the rooting issues described here and in https://github.com/jruby/jruby/pull/3228. - Charlie On Thu, Aug 27, 2015 at 7:04 AM Jochen Theodorou wrote: > One more thing... > > Remi, I tried your link with my simplified scenario and there it does > not stop the collection of the classloader > > On 27.08.2015 11:54, Jochen Theodorou wrote: > > Hi, > > > > In trying to reproduce the problem outside of Groovy I stumbled over a > > case which I think should work > > > > public class MyClassValue extends ClassValue { > > protected Object computeValue(Class type) { > > Dummy ret = new Dummy(); > > Dummy.l.add (this); > > return ret; > > } > > } > > > > class Dummy { > > static final ArrayList l = new ArrayList(); > > } > > > > basically this means there will be a hard reference on the ClassValue > > somewhere. It can be in a static or non-static field, direct or > > indirect. But this won't collect. If I put for example a WeakReference > > in between it works again. > > > > Finally I also tested to put the hard reference in a third class > > instead, to avoid this self reference. But it still cannot collect. > > > > So I currently have the impression that if anything holds a hard > > reference on the class value that the classloader cannot be collected > > anymore. > > > > Unless I misunderstand something here I see that as a bug > > > > bye blackdrag > > > > > -- > Jochen "blackdrag" Theodorou > blog: http://blackdragsview.blogspot.com/ > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > -- - Charlie (mobile) ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
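For reference, Jochen's quoted reproducer cleaned up into compilable form (generics added; the list field renamed to `HOLDER` for clarity). The key property is the strong chain static field -> ClassValue, which was observed to keep the classloader from being collected:

```java
import java.util.ArrayList;
import java.util.List;

public class MyClassValue extends ClassValue<Object> {
    @Override
    protected Object computeValue(Class<?> type) {
        Dummy ret = new Dummy();
        Dummy.HOLDER.add(this);   // hard reference chain: static field -> this ClassValue
        return ret;
    }

    static final class Dummy {
        static final List<Object> HOLDER = new ArrayList<>();
    }

    public static void main(String[] args) {
        MyClassValue cv = new MyClassValue();
        Object v = cv.get(String.class);
        System.out.println(v == cv.get(String.class));   // value is cached per class
        System.out.println(Dummy.HOLDER.contains(cv));   // ...and the ClassValue is now rooted
    }
}
```

As the thread notes, inserting a WeakReference between `HOLDER` and the ClassValue restores collectability, which is what points at the rooting behavior being the bug.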
Re: Error, Java 8, lambda form compilation
Ah-ha...I added some logging, which of course made the error go away...but about ten tests later I got a metaspace OOM. Could be this was all just a memory issue, but it would be nice if the error didn't get swallowed. - Charlie On Wed, Feb 28, 2018 at 12:40 PM Charles Oliver Nutter wrote: > Hey, I'm still not sure how best to deal with this, but we've been > consistently getting a similar error at the same place. It has kept JRuby > master CI red for many weeks. > > The problem does not reproduce when running in isolation...only in a long > test run, and so far only on Travis CI (Ubuntu 16.something, Java 8u151). > > Looking at the code, it appears the dropArguments call below (called from > MethodHandles.guardWithTest:3018) was replaced with some new code and > dropArgumentsToMatch in 9. I have not read through logs to see if that > change might be related. > > Unhandled Java exception: java.lang.InternalError: > exactInvoker=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{ > [exec] t3:L=BoundMethodHandle$Species_LL.argL1(a0:L); > [exec] t4:L=MethodHandle.invokeBasic(t3:L); > [exec] t5:L=BoundMethodHandle$Species_LL.argL0(a0:L); > [exec] t6:V=Invokers.checkExactType(t4:L,t5:L); > [exec] t7:V=Invokers.checkCustomized(t4:L); > [exec] t8:I=MethodHandle.invokeBasic(t4:L);t8:I} > [exec] java.lang.InternalError: > exactInvoker=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{ > [exec] t3:L=BoundMethodHandle$Species_LL.argL1(a0:L); > [exec] t4:L=MethodHandle.invokeBasic(t3:L); > [exec] t5:L=BoundMethodHandle$Species_LL.argL0(a0:L); > [exec] t6:V=Invokers.checkExactType(t4:L,t5:L); > [exec] t7:V=Invokers.checkCustomized(t4:L); > [exec] t8:I=MethodHandle.invokeBasic(t4:L);t8:I} > [exec]newInternalError at > java/lang/invoke/MethodHandleStatics.java:127 > [exec] compileToBytecode at java/lang/invoke/LambdaForm.java:660 > [exec] prepare at java/lang/invoke/LambdaForm.java:635 > [exec] at java/lang/invoke/MethodHandle.java:461 > [exec] at java/lang/invoke/BoundMethodHandle.java:58 > [exec] at 
java/lang/invoke/Species_LL:-1 > [exec]copyWith at java/lang/invoke/Species_LL:-1 > [exec] dropArguments at java/lang/invoke/MethodHandles.java:2465 > [exec] guardWithTest at java/lang/invoke/MethodHandles.java:3018 > [exec] guardWithTest at java/lang/invoke/SwitchPoint.java:173 > [exec] searchConst at > org/jruby/ir/targets/ConstantLookupSite.java:103 > > > On Fri, Jan 12, 2018 at 9:54 AM Charles Oliver Nutter > wrote: > >> I wish I could provide more info here. Just got another one in CI: >> >> [exec] [1603/8763] >> TestBenchmark#test_benchmark_makes_extra_calcultations_with_an_Array_at_the_end_of_the_benchmark_and_show_the_resultUnhandled >> Java exception: java.lang.BootstrapMethodError: call site initialization >> exception >> [exec] java.lang.BootstrapMethodError: call site initialization >> exception >> [exec] makeSite at java/lang/invoke/CallSite.java:341 >> [exec] linkCallSiteImpl at >> java/lang/invoke/MethodHandleNatives.java:307 >> [exec] linkCallSite at >> java/lang/invoke/MethodHandleNatives.java:297 >> [exec] block in autorun at >> /home/travis/build/jruby/jruby/test/mri/lib/test/unit.rb:935 >> [exec] callDirect at >> org/jruby/runtime/CompiledIRBlockBody.java:151 >> [exec] call at org/jruby/runtime/IRBlockBody.java:77 >> [exec] call at org/jruby/runtime/Block.java:124 >> [exec] call at org/jruby/RubyProc.java:288 >> [exec] call at org/jruby/RubyProc.java:272 >> [exec] tearDown at org/jruby/Ruby.java:3276 >> [exec] tearDown at org/jruby/Ruby.java:3249 >> [exec]internalRun at org/jruby/Main.java:309 >> [exec]run at org/jruby/Main.java:232 >> [exec] main at org/jruby/Main.java:204 >> [exec] >> [exec] Caused by: >> [exec] java.lang.InternalError: >> BMH.reinvoke=Lambda(a0:L/SpeciesData,a1:L,a2:L,a3:L)=>{ >> [exec] t4:L=Species_L.argL0(a0:L); >> [exec] t5:L=MethodHandle.invokeBasic(t4:L,a1:L,a2:L,a3:L);t5:L} >> [exec] newInternalError at >> java/lang/invoke/MethodHandleStatics.java:127 >> [exec]compileToBytecode at java/lang/invoke/LambdaForm.java:660 >> 
[exec] prepare at java/lang/invoke/LambdaFor
Re: Error, Java 8, lambda form compilation
Hey, I'm still not sure how best to deal with this, but we've been consistently getting a similar error at the same place. It has kept JRuby master CI red for many weeks. The problem does not reproduce when running in isolation...only in a long test run, and so far only on Travis CI (Ubuntu 16.something, Java 8u151). Looking at the code, it appears the dropArguments call below (called from MethodHandles.guardWithTest:3018) was replaced with some new code and dropArgumentsToMatch in 9. I have not read through logs to see if that change might be related. Unhandled Java exception: java.lang.InternalError: exactInvoker=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{ [exec] t3:L=BoundMethodHandle$Species_LL.argL1(a0:L); [exec] t4:L=MethodHandle.invokeBasic(t3:L); [exec] t5:L=BoundMethodHandle$Species_LL.argL0(a0:L); [exec] t6:V=Invokers.checkExactType(t4:L,t5:L); [exec] t7:V=Invokers.checkCustomized(t4:L); [exec] t8:I=MethodHandle.invokeBasic(t4:L);t8:I} [exec] java.lang.InternalError: exactInvoker=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{ [exec] t3:L=BoundMethodHandle$Species_LL.argL1(a0:L); [exec] t4:L=MethodHandle.invokeBasic(t3:L); [exec] t5:L=BoundMethodHandle$Species_LL.argL0(a0:L); [exec] t6:V=Invokers.checkExactType(t4:L,t5:L); [exec] t7:V=Invokers.checkCustomized(t4:L); [exec] t8:I=MethodHandle.invokeBasic(t4:L);t8:I} [exec]newInternalError at java/lang/invoke/MethodHandleStatics.java:127 [exec] compileToBytecode at java/lang/invoke/LambdaForm.java:660 [exec] prepare at java/lang/invoke/LambdaForm.java:635 [exec] at java/lang/invoke/MethodHandle.java:461 [exec] at java/lang/invoke/BoundMethodHandle.java:58 [exec] at java/lang/invoke/Species_LL:-1 [exec]copyWith at java/lang/invoke/Species_LL:-1 [exec] dropArguments at java/lang/invoke/MethodHandles.java:2465 [exec] guardWithTest at java/lang/invoke/MethodHandles.java:3018 [exec] guardWithTest at java/lang/invoke/SwitchPoint.java:173 [exec] searchConst at org/jruby/ir/targets/ConstantLookupSite.java:103 On Fri, Jan 12, 2018 
at 9:54 AM Charles Oliver Nutter wrote: > I wish I could provide more info here. Just got another one in CI: > > [exec] [1603/8763] > TestBenchmark#test_benchmark_makes_extra_calcultations_with_an_Array_at_the_end_of_the_benchmark_and_show_the_resultUnhandled > Java exception: java.lang.BootstrapMethodError: call site initialization > exception > [exec] java.lang.BootstrapMethodError: call site initialization exception > [exec] makeSite at java/lang/invoke/CallSite.java:341 > [exec] linkCallSiteImpl at > java/lang/invoke/MethodHandleNatives.java:307 > [exec] linkCallSite at > java/lang/invoke/MethodHandleNatives.java:297 > [exec] block in autorun at > /home/travis/build/jruby/jruby/test/mri/lib/test/unit.rb:935 > [exec] callDirect at > org/jruby/runtime/CompiledIRBlockBody.java:151 > [exec] call at org/jruby/runtime/IRBlockBody.java:77 > [exec] call at org/jruby/runtime/Block.java:124 > [exec] call at org/jruby/RubyProc.java:288 > [exec] call at org/jruby/RubyProc.java:272 > [exec] tearDown at org/jruby/Ruby.java:3276 > [exec] tearDown at org/jruby/Ruby.java:3249 > [exec]internalRun at org/jruby/Main.java:309 > [exec]run at org/jruby/Main.java:232 > [exec] main at org/jruby/Main.java:204 > [exec] > [exec] Caused by: > [exec] java.lang.InternalError: > BMH.reinvoke=Lambda(a0:L/SpeciesData,a1:L,a2:L,a3:L)=>{ > [exec] t4:L=Species_L.argL0(a0:L); > [exec] t5:L=MethodHandle.invokeBasic(t4:L,a1:L,a2:L,a3:L);t5:L} > [exec] newInternalError at > java/lang/invoke/MethodHandleStatics.java:127 > [exec]compileToBytecode at java/lang/invoke/LambdaForm.java:660 > [exec] prepare at java/lang/invoke/LambdaForm.java:635 > [exec]at java/lang/invoke/MethodHandle.java:461 > [exec]at java/lang/invoke/BoundMethodHandle.java:58 > [exec]at > java/lang/invoke/BoundMethodHandle.java:211 > [exec] make at > java/lang/invoke/BoundMethodHandle.java:224 > [exec]makeReinvoker at > java/lang/invoke/BoundMethodHandle.java:141 > [exec] rebind at > java/lang/invoke/DirectMethodHandle.java:130 > 
[exec] insertArguments at java/lang/invoke/MethodHandles.java:2371 > [exec] up at > com/headius/invokebinder/tr
Performance of non-static method handles
Hey folks! I'm running some simple benchmarks for my FOSDEM handles talk and wanted to reopen discussion about the performance of non-static-final method handles. In my test, I just try to call a method that adds a given argument to a static long. The numbers for reflection and static final handle are what I'd expect, with the latter basically being equivalent to a direct call:

Direct: 0.05ns/call
Reflected: 3ns/call
static final Handle: 0.05ns/call

If the handle is coming from an instance field or local variable, however, performance is only slightly faster than reflection. I assume the only real improvement in this case is that it doesn't box the long value I pass in.

local var Handle: 2.7ns/call

What can we do to improve the performance of non-static method handle invocation? - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
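For context, a stripped-down sketch (not the actual FOSDEM benchmark; names are mine) of the two shapes being compared. The JIT treats a `static final` MethodHandle as a true constant and can inline through `invokeExact`; the same handle read from an instance field is just a value, so the call goes through the handle's out-of-line invoker instead.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class HandleConstness {
    static long counter;
    static void add(long v) { counter += v; }

    // Constant to the JIT: invokeExact through this can inline to a direct call.
    static final MethodHandle STATIC_FINAL = lookupAdd();

    // Non-constant: the JIT must load the handle and dispatch through its invoker.
    MethodHandle instanceField = lookupAdd();

    static MethodHandle lookupAdd() {
        try {
            return MethodHandles.lookup().findStatic(
                    HandleConstness.class, "add",
                    MethodType.methodType(void.class, long.class));
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) throws Throwable {
        STATIC_FINAL.invokeExact(1L);                          // constant-handle path
        new HandleConstness().instanceField.invokeExact(2L);   // generic invoker path
        System.out.println(counter);                           // 3
    }
}
```

Both calls compute the same result; only the JIT's ability to prove the handle constant differs, which is where the roughly 50x gap in the numbers above comes from.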
Re: Error, Java 8, lambda form compilation
I wish I could provide more info here. Just got another one in CI: [exec] [1603/8763] TestBenchmark#test_benchmark_makes_extra_calcultations_with_an_Array_at_the_end_of_the_benchmark_and_show_the_resultUnhandled Java exception: java.lang.BootstrapMethodError: call site initialization exception [exec] java.lang.BootstrapMethodError: call site initialization exception [exec] makeSite at java/lang/invoke/CallSite.java:341 [exec] linkCallSiteImpl at java/lang/invoke/MethodHandleNatives.java:307 [exec] linkCallSite at java/lang/invoke/MethodHandleNatives.java:297 [exec] block in autorun at /home/travis/build/jruby/jruby/test/mri/lib/test/unit.rb:935 [exec] callDirect at org/jruby/runtime/CompiledIRBlockBody.java:151 [exec] call at org/jruby/runtime/IRBlockBody.java:77 [exec] call at org/jruby/runtime/Block.java:124 [exec] call at org/jruby/RubyProc.java:288 [exec] call at org/jruby/RubyProc.java:272 [exec] tearDown at org/jruby/Ruby.java:3276 [exec] tearDown at org/jruby/Ruby.java:3249 [exec]internalRun at org/jruby/Main.java:309 [exec]run at org/jruby/Main.java:232 [exec] main at org/jruby/Main.java:204 [exec] [exec] Caused by: [exec] java.lang.InternalError: BMH.reinvoke=Lambda(a0:L/SpeciesData,a1:L,a2:L,a3:L)=>{ [exec] t4:L=Species_L.argL0(a0:L); [exec] t5:L=MethodHandle.invokeBasic(t4:L,a1:L,a2:L,a3:L);t5:L} [exec] newInternalError at java/lang/invoke/MethodHandleStatics.java:127 [exec]compileToBytecode at java/lang/invoke/LambdaForm.java:660 [exec] prepare at java/lang/invoke/LambdaForm.java:635 [exec]at java/lang/invoke/MethodHandle.java:461 [exec]at java/lang/invoke/BoundMethodHandle.java:58 [exec]at java/lang/invoke/BoundMethodHandle.java:211 [exec] make at java/lang/invoke/BoundMethodHandle.java:224 [exec]makeReinvoker at java/lang/invoke/BoundMethodHandle.java:141 [exec] rebind at java/lang/invoke/DirectMethodHandle.java:130 [exec] insertArguments at java/lang/invoke/MethodHandles.java:2371 [exec] up at com/headius/invokebinder/transform/Insert.java:99 On Tue, 
Jan 9, 2018 at 12:18 PM Vladimir Ivanov < vladimir.x.iva...@oracle.com> wrote: > Thanks, Charlie. > > Unfortunately, it doesn't give much info without the exception which > caused it. > > jdk/src/share/classes/java/lang/invoke/LambdaForm.java: > 659 } catch (Error | Exception ex) { > 660 throw newInternalError(this.toString(), ex); > 661 } > > Best regards, > Vladimir Ivanov > > On 1/9/18 9:10 PM, Charles Oliver Nutter wrote: > > Unfortunately this just happened in one build, but I thought I'd post it > > here for posterity. > > > > Unhandled Java exception: java.lang.InternalError: > identity_L=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{ > > [exec] t3:L=Species_L.argL0(a0:L);t3:L} > > [exec] java.lang.InternalError: > identity_L=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{ > > [exec] t3:L=Species_L.argL0(a0:L);t3:L} > > [exec]newInternalError at > java/lang/invoke/MethodHandleStatics.java:127 > > [exec] compileToBytecode at java/lang/invoke/LambdaForm.java:660 > > [exec] prepare at java/lang/invoke/LambdaForm.java:635 > > [exec] at > java/lang/invoke/MethodHandle.java:461 > > [exec] at > java/lang/invoke/BoundMethodHandle.java:58 > > [exec] at > java/lang/invoke/BoundMethodHandle.java:211 > > [exec]copyWith at > java/lang/invoke/BoundMethodHandle.java:228 > > [exec] dropArguments at > java/lang/invoke/MethodHandles.java:2465 > > [exec] dropArguments at > java/lang/invoke/MethodHandles.java:2535 > > [exec] up at > com/headius/invokebinder/transform/Drop.java:39 > > [exec] invoke at > com/headius/invokebinder/Binder.java:1143 > > [exec]constant at > com/headius/invokebinder/Binder.java:1116 > > [exec] searchConst at > org/jruby/ir/targets/ConstantLookupSite.java:98 > > [exec]block in autorun at > /home/travis/build/jruby/jruby/test/mri/lib/test/unit.rb:935 > > [exec] callDirect at > org/jruby/runtime/CompiledIRBlockBody.java:151 > > [exec]call at org/jruby/runtime/IRBlockBody.java:77 > > [exec]call at org/jruby/runtime/Block.jav
Error, Java 8, lambda form compilation
Unfortunately this just happened in one build, but I thought I'd post it here for posterity. Unhandled Java exception: java.lang.InternalError: identity_L=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{ [exec] t3:L=Species_L.argL0(a0:L);t3:L} [exec] java.lang.InternalError: identity_L=Lambda(a0:L/SpeciesData,a1:L,a2:L)=>{ [exec] t3:L=Species_L.argL0(a0:L);t3:L} [exec]newInternalError at java/lang/invoke/MethodHandleStatics.java:127 [exec] compileToBytecode at java/lang/invoke/LambdaForm.java:660 [exec] prepare at java/lang/invoke/LambdaForm.java:635 [exec] at java/lang/invoke/MethodHandle.java:461 [exec] at java/lang/invoke/BoundMethodHandle.java:58 [exec] at java/lang/invoke/BoundMethodHandle.java:211 [exec]copyWith at java/lang/invoke/BoundMethodHandle.java:228 [exec] dropArguments at java/lang/invoke/MethodHandles.java:2465 [exec] dropArguments at java/lang/invoke/MethodHandles.java:2535 [exec] up at com/headius/invokebinder/transform/Drop.java:39 [exec] invoke at com/headius/invokebinder/Binder.java:1143 [exec]constant at com/headius/invokebinder/Binder.java:1116 [exec] searchConst at org/jruby/ir/targets/ConstantLookupSite.java:98 [exec]block in autorun at /home/travis/build/jruby/jruby/test/mri/lib/test/unit.rb:935 [exec] callDirect at org/jruby/runtime/CompiledIRBlockBody.java:151 [exec]call at org/jruby/runtime/IRBlockBody.java:77 [exec]call at org/jruby/runtime/Block.java:124 [exec]call at org/jruby/RubyProc.java:288 [exec]call at org/jruby/RubyProc.java:272 [exec]tearDown at org/jruby/Ruby.java:3276 [exec]tearDown at org/jruby/Ruby.java:3249 [exec] internalRun at org/jruby/Main.java:309 [exec] run at org/jruby/Main.java:232 [exec]main at org/jruby/Main.java:204 - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Writing a compiler to handles, but filter seems to executed in reverse
Thanks a bunch y'all. I'm thinking invokebinder should do the "right thing" and manually apply the filters in the proper order on affected JVMs...or perhaps always. Warm-up notwithstanding, what cost would we pay to always do single filter MHs versus doing them as a group that instead becomes single LF adaptations? On Wed, Jan 3, 2018, 21:22 John Rose wrote: > Thanks, IBM!! > > Filed: https://bugs.openjdk.java.net/browse/JDK-8194554 > > On Jan 3, 2018, at 12:04 PM, Remi Forax wrote: > > > IBM implementation uses the left to right order ! > I've just tested with the latest Java 8 available. > > Java(TM) SE Runtime Environment (build 8.0.5.7 - > pxa6480sr5fp7-20171216_01(SR5 FP7)) > IBM J9 VM (build 2.9, JRE 1.8.0 Linux amd64-64 Compressed References > 20171215_373586 (JIT enabled, AOT enabled) > OpenJ9 - 5aa401f > OMR - 101e793 > IBM - b4a79bf) > > so it's an implementation bug, #2 seems to be the right solution. > > Rémi > > -- > > *De: *"John Rose" > *À: *"Da Vinci Machine Project" > *Envoyé: *Mercredi 3 Janvier 2018 20:37:42 > *Objet: *Re: Writing a compiler to handles, but filter seems to executed > in reverse > > On Jan 2, 2018, at 12:35 PM, Charles Oliver Nutter > wrote: > > > Is there a good justification for doing it this way, rather than having > > filterArguments start with the *last* filter nearest the target? > > > No, it's a bug. The javadoc API spec. does not emphasize the ordering > of the filter invocations, but the pseudocode makes it pretty clear what > order things should come in. Certainly the spec. does not promise the > current behavior. When I wrote the spec. I intended the Java argument > evaluation order to apply, and the filters to be executed left-to-right. > And then, when I wrote the code, I used an accumulative algorithm > with a for-each loop, leading indirectly to reverse evaluation order. > Oops. > > There are two ways forward: > > 1. Declare the spec. ambiguous, and document the current behavior > as the de facto standard. > > 2. 
Declare the spec. unambiguous, change the behavior to left-to-right > as a bug fix, and clarify the spec. > > I think we can try for #2, on the grounds that multiple filters are a rare > occurrence. The risk is that existing code that uses multiple filters > *and* > has side effect ordering constraints between the filters will break. > > Question: What does the IBM JVM do? I think they have a very > different implementation, and they are supposed to follow the spec. > > — John > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
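The ordering question above is easy to probe from user code. Below is a small, self-contained demonstration (names are mine, not from the thread) using two side-effecting filters on one target: the trace records which order a given JVM uses, while the computed value is the same either way.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

public class FilterOrder {
    static final List<String> TRACE = new ArrayList<>();

    static int traced(String tag, int v) { TRACE.add(tag); return v; }
    static int sum(int a, int b) { return a + b; }

    // Bind a tag into traced(), producing an (int)int filter with a side effect.
    static MethodHandle taggedFilter(String tag) throws ReflectiveOperationException {
        MethodHandle h = MethodHandles.lookup().findStatic(
                FilterOrder.class, "traced",
                MethodType.methodType(int.class, String.class, int.class));
        return MethodHandles.insertArguments(h, 0, tag);
    }

    static int run() throws Throwable {
        MethodHandle target = MethodHandles.lookup().findStatic(
                FilterOrder.class, "sum",
                MethodType.methodType(int.class, int.class, int.class));
        MethodHandle filtered = MethodHandles.filterArguments(
                target, 0, taggedFilter("a"), taggedFilter("b"));
        return (int) filtered.invokeExact(1, 2);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(run());   // 3 either way; only side-effect order differs
        System.out.println(TRACE);   // [a, b] left-to-right, [b, a] on affected JDKs
    }
}
```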
Re: Writing a compiler to handles, but filter seems to executed in reverse
So I have some basic expressions working in my pseudo-compiler, and the experiment has been interesting so far. A few things I've learned: (for code "a = 1; b = 2; (a + b) > 1", here's the assembly output: https://gist.github.com/headius/f765260a00590fc2b4cd033b5a657e6b)

* The approach is interesting and stretches handles quite a bit, but it takes a long time to heat up and longer to generate native code. This may be acceptable on platforms where user code can't load new JVM bytecode (assuming the handle impl on that platform produces decent code).
* My mechanism of using Object[] to hold local variables seems to break escape analysis on both HotSpot and Graal, probably because that array write is too opaque to escape through. The Object[] itself is also constructed in the same compilation unit, though; even so, that doesn't appear to help either.
* According to LogCompilation, everything inlines, including the call to my type-checking "add" method for the "+" call here, but...
* According to PrintAssembly, the direct handles to "add" and "lt" don't actually appear to inline. Why? 
0x00011418f1f1: movabs rcx,0x76c5065b0;*invokestatic linkToStatic {reexecute=0 rethrow=0 return_oop=0} ; - java.lang.invoke.LambdaForm$DMH/2137211482::invokeStatic_LL_L@11 ; - java.lang.invoke.LambdaForm$BMH/522188921::reinvoke@50 ; - java.lang.invoke.LambdaForm$BMH/522188921::reinvoke@15 ; - java.lang.invoke.LambdaForm$MH/1973471376::identity_L@68 ; {oop(a 'java/lang/invoke/MemberName' = {method} {0x00010efd7358} 'add' '(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;' in 'com/headius/jruby/HandleCompiler')} 0x00011418f1fb: movQWORD PTR [rsp+0xb0],rax 0x00011418f203: nop 0x00011418f204: nop 0x00011418f205: nop 0x00011418f206: nop 0x00011418f207: call 0x0001138f2420 ; OopMap{[176]=Oop off=460} ;*invokestatic linkToStatic {reexecute=0 rethrow=0 return_oop=0} ; - java.lang.invoke.LambdaForm$DMH/2137211482::invokeStatic_LL_L@11 ; - java.lang.invoke.LambdaForm$BMH/522188921::reinvoke@50 ; - java.lang.invoke.LambdaForm$BMH/522188921::reinvoke@15 ; - java.lang.invoke.LambdaForm$MH/1973471376::identity_L@68 ; {static_call} 0x00011418f20c: movrsi,rax 0x00011418f20f: movabs rdx,0x76c511bb8; {oop(a 'java/lang/Long' = 1)} 0x00011418f219: movabs rcx,0x76c512068;*invokestatic linkToStatic {reexecute=0 rethrow=0 return_oop=0} ; - java.lang.invoke.LambdaForm$DMH/2137211482::invokeStatic_LL_L@11 ; - java.lang.invoke.LambdaForm$BMH/522188921::reinvoke@50 ; - java.lang.invoke.LambdaForm$MH/1973471376::identity_L@68 ; {oop(a 'java/lang/invoke/MemberName' = {method} {0x00010efd77d0} 'gt' '(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;' in 'com/headius/jruby/HandleCompiler')} 0x00011418f223: nop 0x00011418f224: nop 0x00011418f225: nop 0x00011418f226: nop 0x00011418f227: call 0x0001138f2420 ; OopMap{off=492} ;*invokestatic linkToStatic {reexecute=0 rethrow=0 return_oop=0} ; - java.lang.invoke.LambdaForm$DMH/2137211482::invokeStatic_LL_L@11 ; - java.lang.invoke.LambdaForm$BMH/522188921::reinvoke@50 ; - java.lang.invoke.LambdaForm$MH/1973471376::identity_L@68 ; 
{static_call} On Tue, Jan 2, 2018 at 3:54 PM Charles Oliver Nutter wrote: > I have released invokebinder 1.11, which includes Binder.filterForward > that guarantees left-to-right evaluation of the filters (by doing them > individually). > > I'd still like to understand if this is intentional behavior in OpenJDK or > if it is perhaps a bug. > > - Charlie > > On Tue, Jan 2, 2018 at 3:10 PM Charles Oliver Nutter > wrote: > >> Yes, I figured I would need it for that too, but this filter behavior >> sent me off on a weird tangent. >> >> It is gros
Re: Writing a compiler to handles, but filter seems to executed in reverse
I have released invokebinder 1.11, which includes Binder.filterForward that guarantees left-to-right evaluation of the filters (by doing them individually). I'd still like to understand if this is intentional behavior in OpenJDK or if it is perhaps a bug. - Charlie On Tue, Jan 2, 2018 at 3:10 PM Charles Oliver Nutter wrote: > Yes, I figured I would need it for that too, but this filter behavior sent > me off on a weird tangent. > > It is gross in code to do the filters manually in forward order, but > perhaps it's not actually a big deal? OpenJDK's impl applies each filter as > its own layer anyway. > > - Charlie > > On Tue, Jan 2, 2018 at 3:04 PM Remi Forax wrote: > >> You also need the loop combinator for implementing early return (the >> return keyword), >> I think i have an example of how to map a small language to a loop >> combinator somewhere, >> i will try to find that (or rewrite it) tomorrow. >> >> cheers, >> Rémi >> >> -- >> >> *De: *"Charles Oliver Nutter" >> *À: *"Da Vinci Machine Project" >> *Envoyé: *Mardi 2 Janvier 2018 21:36:33 >> *Objet: *Re: Writing a compiler to handles, but filter seems to executed >> in reverse >> >> An alternative workaround: I do the filters myself, manually, in the >> order that I want them to executed. Also gross. >> >> On Tue, Jan 2, 2018 at 2:35 PM Charles Oliver Nutter >> wrote: >> >>> Ahh I believe I see it now. >>> filterArguments starts with the first filter, and wraps the incoming >>> target handle with each in turn. However, because it's starting at the >>> target, you get the filters stacked up in reverse order: >>> >>> filter(target, 0, a, b, c, d) >>> >>> ends up as >>> >>> d_filter(c_filter(b_filter(a_filter(target >>> >>> And so naturally when invoked, they execute in reverse order. >>> >>> This seems I am surprised we have not run into this as a problem, but I >>> believe most of my uses of filter in JRuby have been pure functions where >>> order was not important (except for error conditions). 
>>> >>> Now in looking for a fix, I've run into the nasty workaround required to >>> get filters to execute in the correct order: you have to reverse the >>> filters, and then reverse the results again. This is far from desirable, >>> since it requires at least one permute to put the results back in proper >>> order. >>> >>> Is there a good justification for doing it this way, rather than having >>> filterArguments start with the *last* filter nearest the target? >>> >>> - Charlie >>> >>> On Tue, Jan 2, 2018 at 2:17 PM Charles Oliver Nutter < >>> head...@headius.com> wrote: >>> >>>> Hello all, long time no write! >>>> I'm finally playing with writing a "compiler" for JRuby that uses only >>>> method handles to represent code structure. For most simple expressions, >>>> this obviously works well. However I'm having trouble with blocks of code >>>> that contain multiple expressions. >>>> >>>> Starting with the standard call signature through the handle tree, we >>>> have a basic (Object[])Object type. The Object[] contains local variable >>>> state for the script, and will be as wide as there are local variables. AST >>>> nodes are basically compiled into little functions that take in the >>>> variable state and produce a value. In this way, every expression in the >>>> tree can be compiled, including local variable sets and gets, loops, and so >>>> on. >>>> >>>> Now the tricky bit... >>>> >>>> The root node for a given script contains one or more expressions that >>>> should be executed in sequence, with the final result being returned. The >>>> way I'm handling this in method handles is as follows (invokebinder code >>>> but hopefully easy to read): >>>> >>>> MethodHandle[] handles = >>>> Arrays >>>> .stream(rootNode.children()) >>>> .map(node -> compile(node)) >>>> .toArray(n -> new MethodHandle[n]); >>>> >>>> return Binder.from(Object.class, Object[].class) >>>> .permute(new int[handles.length]) >>>> .filter(0, handles) >&g
Re: Writing a compiler to handles, but filter seems to executed in reverse
Yes, I figured I would need it for that too, but this filter behavior sent me off on a weird tangent. It is gross in code to do the filters manually in forward order, but perhaps it's not actually a big deal? OpenJDK's impl applies each filter as its own layer anyway. - Charlie On Tue, Jan 2, 2018 at 3:04 PM Remi Forax wrote: > You also need the loop combinator for implementing early return (the > return keyword), > I think i have an example of how to map a small language to a loop > combinator somewhere, > i will try to find that (or rewrite it) tomorrow. > > cheers, > Rémi > > -------------- > > *From: *"Charles Oliver Nutter" > *To: *"Da Vinci Machine Project" > *Sent: *Tuesday, January 2, 2018 21:36:33 > *Subject: *Re: Writing a compiler to handles, but filter seems to executed > in reverse > > An alternative workaround: I do the filters myself, manually, in the order > that I want them to be executed. Also gross. > > On Tue, Jan 2, 2018 at 2:35 PM Charles Oliver Nutter > wrote: > >> Ahh I believe I see it now. >> filterArguments starts with the first filter, and wraps the incoming >> target handle with each in turn. However, because it's starting at the >> target, you get the filters stacked up in reverse order: >> >> filter(target, 0, a, b, c, d) >> >> ends up as >> >> d_filter(c_filter(b_filter(a_filter(target >> >> And so naturally when invoked, they execute in reverse order. >> >> This seems like a bug. I am surprised we have not run into this as a problem, but I >> believe most of my uses of filter in JRuby have been pure functions where >> order was not important (except for error conditions). 
>> >> Is there a good justification for doing it this way, rather than having >> filterArguments start with the *last* filter nearest the target? >> >> - Charlie >> >> On Tue, Jan 2, 2018 at 2:17 PM Charles Oliver Nutter >> wrote: >> >>> Hello all, long time no write! >>> I'm finally playing with writing a "compiler" for JRuby that uses only >>> method handles to represent code structure. For most simple expressions, >>> this obviously works well. However I'm having trouble with blocks of code >>> that contain multiple expressions. >>> >>> Starting with the standard call signature through the handle tree, we >>> have a basic (Object[])Object type. The Object[] contains local variable >>> state for the script, and will be as wide as there are local variables. AST >>> nodes are basically compiled into little functions that take in the >>> variable state and produce a value. In this way, every expression in the >>> tree can be compiled, including local variable sets and gets, loops, and so >>> on. >>> >>> Now the tricky bit... >>> >>> The root node for a given script contains one or more expressions that >>> should be executed in sequence, with the final result being returned. The >>> way I'm handling this in method handles is as follows (invokebinder code >>> but hopefully easy to read): >>> >>> MethodHandle[] handles = >>> Arrays >>> .stream(rootNode.children()) >>> .map(node -> compile(node)) >>> .toArray(n -> new MethodHandle[n]); >>> >>> return Binder.from(Object.class, Object[].class) >>> .permute(new int[handles.length]) >>> .filter(0, handles) >>> .drop(0, handles.length - 1) >>> .identity(); >>> >>> In pseudo-code, this basically duplicates the Object[] as many times as >>> there are lines of code to execute, and then uses filterArguments to >>> evaluate each in turn. Then everything but the last result is culled and >>> the final result is returned. >>> >>> Unfortunately, this doesn't work right: filterArguments appears to >>> execute in reverse order. 
When I try to run a simple script like "a = 1; a" >>> the "a" value comes back null, because it is executed first. >>> >>> Is this expected? Do filters, when executed, actually process from the >>> last argument back, rather than the first argument forward? >>> >>> Note: I know this would be possible to do with guaranteed ordering using >>> the new loop combinators in 9. I'm working up to that for examples for a >>> talk. >>> >>> - Charlie >>> >>> > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
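The ordering discussed in the thread above can be reproduced with a minimal, self-contained sketch using plain java.lang.invoke (no InvokeBinder; the class and method names here are mine, purely for illustration). Each filter records an id when it runs:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

public class FilterOrder {
    static final List<Integer> ORDER = new ArrayList<>();

    // A pass-through filter that records when it ran.
    static int tag(int id, int x) { ORDER.add(id); return x; }
    static void target(int a, int b) { }

    static List<Integer> run() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle tag = lookup.findStatic(FilterOrder.class, "tag",
                    MethodType.methodType(int.class, int.class, int.class));
            // Two filters with distinct ids, applied to arguments 0 and 1.
            MethodHandle f0 = MethodHandles.insertArguments(tag, 0, 0);
            MethodHandle f1 = MethodHandles.insertArguments(tag, 0, 1);
            MethodHandle targetHandle = lookup.findStatic(FilterOrder.class, "target",
                    MethodType.methodType(void.class, int.class, int.class));
            MethodHandle filtered = MethodHandles.filterArguments(targetHandle, 0, f0, f1);
            filtered.invokeExact(10, 20);
            // Reverse order ([1, 0]) on the JDKs discussed in this thread;
            // later releases revisited the evaluation order, so results vary.
            return ORDER;
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }
}
```

Both filters always run exactly once; only their relative order is version-dependent.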
Re: Writing a compiler to handles, but filter seems to execute in reverse
An alternative workaround: I do the filters myself, manually, in the order that I want them to execute. Also gross. On Tue, Jan 2, 2018 at 2:35 PM Charles Oliver Nutter wrote: > Ahh, I believe I see it now. > > filterArguments starts with the first filter, and wraps the incoming > target handle with each in turn. However, because it's starting at the > target, you get the filters stacked up in reverse order: > > filter(target, 0, a, b, c, d) > > ends up as > > d_filter(c_filter(b_filter(a_filter(target)))) > > And so naturally, when invoked, they execute in reverse order. > > I am surprised we have not run into this as a problem, but I > believe most of my uses of filter in JRuby have been pure functions where > order was not important (except for error conditions). > > Now, in looking for a fix, I've run into the nasty workaround required to > get filters to execute in the correct order: you have to reverse the > filters, and then reverse the results again. This is far from desirable, > since it requires at least one permute to put the results back in proper > order. > > Is there a good justification for doing it this way, rather than having > filterArguments start with the *last* filter nearest the target? > > - Charlie > > On Tue, Jan 2, 2018 at 2:17 PM Charles Oliver Nutter > wrote: > >> Hello all, long time no write! >> >> I'm finally playing with writing a "compiler" for JRuby that uses only >> method handles to represent code structure. For most simple expressions, >> this obviously works well. However, I'm having trouble with blocks of code >> that contain multiple expressions. >> >> Starting with the standard call signature through the handle tree, we >> have a basic (Object[])Object type. The Object[] contains local variable >> state for the script, and will be as wide as there are local variables. AST >> nodes are basically compiled into little functions that take in the >> variable state and produce a value. 
In this way, every expression in the >> tree can be compiled, including local variable sets and gets, loops, and so >> on. >> >> Now the tricky bit... >> >> The root node for a given script contains one or more expressions that >> should be executed in sequence, with the final result being returned. The >> way I'm handling this in method handles is as follows (invokebinder code >> but hopefully easy to read): >> >> MethodHandle[] handles = >> Arrays >> .stream(rootNode.children()) >> .map(node -> compile(node)) >> .toArray(n -> new MethodHandle[n]); >> >> return Binder.from(Object.class, Object[].class) >> .permute(new int[handles.length]) >> .filter(0, handles) >> .drop(0, handles.length - 1) >> .identity(); >> >> In pseudo-code, this basically duplicates the Object[] as many times as >> there are lines of code to execute, and then uses filterArguments to >> evaluate each in turn. Then everything but the last result is culled and >> the final result is returned. >> >> Unfortunately, this doesn't work right: filterArguments appears to >> execute in reverse order. When I try to run a simple script like "a = 1; a" >> the "a" value comes back null, because it is executed first. >> >> Is this expected? Do filters, when executed, actually process from the >> last argument back, rather than the first argument forward? >> >> Note: I know this would be possible to do with guaranteed ordering using >> the new loop combinators in 9. I'm working up to that for examples for a >> talk. >> >> - Charlie >> >> ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Writing a compiler to handles, but filter seems to execute in reverse
Ahh, I believe I see it now. filterArguments starts with the first filter, and wraps the incoming target handle with each in turn. However, because it's starting at the target, you get the filters stacked up in reverse order: filter(target, 0, a, b, c, d) ends up as d_filter(c_filter(b_filter(a_filter(target)))) And so naturally, when invoked, they execute in reverse order. I am surprised we have not run into this as a problem, but I believe most of my uses of filter in JRuby have been pure functions where order was not important (except for error conditions). Now, in looking for a fix, I've run into the nasty workaround required to get filters to execute in the correct order: you have to reverse the filters, and then reverse the results again. This is far from desirable, since it requires at least one permute to put the results back in proper order. Is there a good justification for doing it this way, rather than having filterArguments start with the *last* filter nearest the target? - Charlie On Tue, Jan 2, 2018 at 2:17 PM Charles Oliver Nutter wrote: > Hello all, long time no write! > > I'm finally playing with writing a "compiler" for JRuby that uses only > method handles to represent code structure. For most simple expressions, > this obviously works well. However, I'm having trouble with blocks of code > that contain multiple expressions. > > Starting with the standard call signature through the handle tree, we have > a basic (Object[])Object type. The Object[] contains local variable state > for the script, and will be as wide as there are local variables. AST nodes > are basically compiled into little functions that take in the variable > state and produce a value. In this way, every expression in the tree can be > compiled, including local variable sets and gets, loops, and so on. > > Now the tricky bit... > > The root node for a given script contains one or more expressions that > should be executed in sequence, with the final result being returned. 
The > way I'm handling this in method handles is as follows (invokebinder code > but hopefully easy to read): > > MethodHandle[] handles = > Arrays > .stream(rootNode.children()) > .map(node -> compile(node)) > .toArray(n -> new MethodHandle[n]); > > return Binder.from(Object.class, Object[].class) > .permute(new int[handles.length]) > .filter(0, handles) > .drop(0, handles.length - 1) > .identity(); > > In pseudo-code, this basically duplicates the Object[] as many times as > there are lines of code to execute, and then uses filterArguments to > evaluate each in turn. Then everything but the last result is culled and > the final result is returned. > > Unfortunately, this doesn't work right: filterArguments appears to execute > in reverse order. When I try to run a simple script like "a = 1; a" the "a" > value comes back null, because it is executed first. > > Is this expected? Do filters, when executed, actually process from the > last argument back, rather than the first argument forward? > > Note: I know this would be possible to do with guaranteed ordering using > the new loop combinators in 9. I'm working up to that for examples for a > talk. > > - Charlie > > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Writing a compiler to handles, but filter seems to execute in reverse
Hello all, long time no write! I'm finally playing with writing a "compiler" for JRuby that uses only method handles to represent code structure. For most simple expressions, this obviously works well. However I'm having trouble with blocks of code that contain multiple expressions. Starting with the standard call signature through the handle tree, we have a basic (Object[])Object type. The Object[] contains local variable state for the script, and will be as wide as there are local variables. AST nodes are basically compiled into little functions that take in the variable state and produce a value. In this way, every expression in the tree can be compiled, including local variable sets and gets, loops, and so on. Now the tricky bit... The root node for a given script contains one or more expressions that should be executed in sequence, with the final result being returned. The way I'm handling this in method handles is as follows (invokebinder code but hopefully easy to read): MethodHandle[] handles = Arrays .stream(rootNode.children()) .map(node -> compile(node)) .toArray(n -> new MethodHandle[n]); return Binder.from(Object.class, Object[].class) .permute(new int[handles.length]) .filter(0, handles) .drop(0, handles.length - 1) .identity(); In pseudo-code, this basically duplicates the Object[] as many times as there are lines of code to execute, and then uses filterArguments to evaluate each in turn. Then everything but the last result is culled and the final result is returned. Unfortunately, this doesn't work right: filterArguments appears to execute in reverse order. When I try to run a simple script like "a = 1; a" the "a" value comes back null, because it is executed first. Is this expected? Do filters, when executed, actually process from the last argument back, rather than the first argument forward? Note: I know this would be possible to do with guaranteed ordering using the new loop combinators in 9. I'm working up to that for examples for a talk. 
- Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
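One ordering-safe alternative to the filter trick in the question above: foldArguments guarantees that the combiner runs before the target, so a block of statements can be chained by folding each earlier statement ahead of the rest and dropping its value. This is a sketch under my own names, not the JRuby implementation:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;
import java.util.List;

public class Sequence {
    // Two hypothetical "statement" handles of type (Object[])Object.
    static Object assignA(Object[] vars) { vars[0] = 1; return null; }  // "a = 1"
    static Object readA(Object[] vars) { return vars[0]; }              // "a"

    // Combine (Object[])Object statement handles so they run in order,
    // returning the last statement's result.
    static MethodHandle sequence(List<MethodHandle> stmts) {
        MethodHandle result = stmts.get(stmts.size() - 1);
        for (int i = stmts.size() - 2; i >= 0; i--) {
            // foldArguments invokes the combiner (the earlier statement) before
            // the target; dropArguments discards the earlier statement's value.
            result = MethodHandles.foldArguments(
                    MethodHandles.dropArguments(result, 0, Object.class),
                    stmts.get(i));
        }
        return result;
    }

    static Object demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType t = MethodType.methodType(Object.class, Object[].class);
            MethodHandle body = sequence(Arrays.asList(
                    lookup.findStatic(Sequence.class, "assignA", t),
                    lookup.findStatic(Sequence.class, "readA", t)));
            return (Object) body.invokeExact(new Object[1]);  // runs "a = 1; a"
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }
}
```

The chain is as deep as the statement list, but the execution order is specified rather than an implementation accident.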
Managing JNI resources in a classloader that might go away
Hey smart folks, I have a conundrum and Google is failing me. As you may know, we have been maintaining the Java Native Runtime libraries for providing FFI from Java pre-Panama. These libraries handle loading and binding C functions to Java endpoints automagically. Unfortunately, jffi -- the base library of the JNR stack -- has a few off-heap structures it allocates to support FFI calls. Those structures are generally held in static fields and cleaned up via finalization. This seems to be a somewhat fatal design flaw in situations where the classloader that started up jffi might go away long before the JVM shuts down. I've got a segfault, and all signs point toward it being a case of trying to call the JNI C code in jffi *after* the classloader has finalized and unloaded the library. The si_addr of the SIGSEGV and the top frame of the stack are the same address, which tells me that the segfault was caused by trying to call the JNI C code, which in this case is custom code to clean up those off-heap resources. I have found no easy answer to this problem. You can't tell when your classloader unloads, and as far as I can tell you can't tell that the JNI library has gone away. And of course you can't guarantee finalization order. Sometimes, it works fine. But eventually, it fails. My logging of classloader finalization versus data freeing ends like this: classloader finalized 2014779152 freeing in 2014779152 freeing in 2014779152 I have not come up with any solution. These off-heap structures are tied to the lifecycle of the JNI backend, but there's no obvious way to clean them up just before the JNI backend gets unloaded. So, questions: 1. Does it seem like I'm on the right track? 2. Anyone have ideas for dealing with this? My best idea right now is to add a bunch of smarts to JNI_OnUnload that tidies everything up, rather than allowing finalization to do it at some indeterminate time in the future. 3. Why does JNI + classloading suck so bad? 
Frustratedly yours, - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Classloading glitch in publicLookup.findConstructor?
We've had a number of reports of LinkageError-related problems with a constructor handle acquired through publicLookup. Here's the commit that fixed the problem, with a link to a PR explaining the error: https://github.com/jruby/jruby/commit/32926ac194c03f0e61c0121e9da0b0427cfa5869 It seems like the error indicates a class is getting loaded into the bootstrap classloader during lookup when it should not, and as a result any child classloaders that load it later on have a conflicting copy. Thoughts? This is tested on a very recent Java 8. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Leaking LambdaForm classes?
On Fri, Jan 6, 2017 at 2:59 PM, Vladimir Ivanov < vladimir.x.iva...@oracle.com> wrote: > LambdaForm caches deliberately keep LF instances using SoftReferences. > > The motivation is: > (1) LFs are heavily shared; > (2) LFs are expensive to construct (LF interpreter is turned off by > default now); it involves the following steps: new LF instance + compile to > bytecode + class loading. > > So, keeping a LF instance for a while usually pays off, especially during > startup/warmup. There should be some heap/metaspace pressure to get them > cleared. > > As a workaround, try -XX:SoftRefLRUPolicyMSPerMB=0 to make soft references > behave as weak. I'll pass that along, thank you. I'm not sure how vigorously he's tried to get GC to clear things out. Not sure the problem relates to j.l.i & LFs since the report says indy in > jruby is turned off. For heavy usages of indy/j.l.i 1000s of LFs are > expected (<5k). The question is how does the count change over time. > JRuby has progressed to the point of using method handles and indy all the time, since for some cases the benefits are present without any issues. "Enabling" indy in JRuby mostly just turns on the use of indy for method call sites and instance variables now. That said, the numbers this user is reporting do seem really high, which is why I asked in here for similar stories. Even if we considered a very large Ruby application with many hundreds of files, we'd still see non-indy MH usages in JRuby in thousands at best (mostly for "constant" lookup sites). - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Leaking LambdaForm classes?
Anyone else encountered this? https://github.com/jruby/jruby/issues/4391 We have a user reporting metaspace getting filled up with LambdaForm classes that have no instances. I would not expect this to happen given that they're generated via AnonymousClassloader and we would need to hold a reference to them to keep them alive. I'm trying to get a heap dump from this user. If anyone has other suggestions, feel free to comment on the issue. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: series of switchpoints or better
On Wed, Oct 5, 2016 at 6:26 PM, Jochen Theodorou wrote: > There is one more special problem I have though: per instance meta > classes. So even if a x and y have the same class as per JVM, they can have > differing meta classes. Which means a switchpoint alone is not enough... > well, trying to get rid of that in the new MOP. JRuby also has per-instance classes (so-called "singleton classes"). We treat them like any other class. HOWEVER...if there's a singleton class that does not override any methods from the original class, it shares a SwitchPoint until such time that it is modified. I've also considered caching singleton classes of various shapes, so we can just choose based on known shapes...but never went further with that experiment. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
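The sharing scheme described above can be sketched roughly as follows. This is illustrative bookkeeping only, not JRuby's actual code; all names are mine:

```java
import java.lang.invoke.SwitchPoint;

// A singleton (per-instance) class shares its parent's SwitchPoint until it
// is first modified, at which point it gets its own and invalidates the old.
public class MetaClass {
    final MetaClass parent;
    private SwitchPoint sp;   // null => still sharing the parent's SwitchPoint

    MetaClass(MetaClass parent) { this.parent = parent; }

    SwitchPoint switchPoint() {
        if (sp != null) return sp;
        if (parent != null) return parent.switchPoint();
        return sp = new SwitchPoint();
    }

    // Called when a method is (re)defined on this class: stop sharing and
    // invalidate the guard that call sites may have captured.
    void modified() {
        SwitchPoint stale = switchPoint();
        sp = new SwitchPoint();
        // A real runtime would also refresh any ancestor that owned `stale`;
        // this sketch only invalidates the stale guard.
        SwitchPoint.invalidateAll(new SwitchPoint[] { stale });
    }
}
```

Until `modified()` is called, call sites guarding on the singleton and on the parent share one SwitchPoint, so an unmodified singleton costs nothing extra.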
Re: series of switchpoints or better
On Oct 5, 2016 17:43, "Jochen Theodorou" wrote: > I see... the problem is actually similar, only that I do not have to do something like that on a per "subclass added" event, but on a per "method crud operation" event. And instead of going up to check for a devirtualization, I have to actually propagate the change to all meta classes of subclasses... and interface implementation (if the change was made to an interface). So far I was thinking of making this lazy... but maybe I should actually mark the classes as "dirty" eagerly... sorry... not part of the discussion I guess ;) Oh I think it is certainly relevant! JRuby does this invalidation eagerly, but the cost can be high for changes to classes close to the root of the hierarchy. You have fewer guards at each call site, though. John's description of how Hotspot does this is also helpful; at least in JRuby, searching up-hierarchy for overridden methods is just a name lookup since Ruby does not overload. I've prototyped a similar system, with a SwitchPoint per method, but ran into some hairy class structures that made it complicated. The override search may be the answer for me. - Charlie (mobile) ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: EXT: Re: series of switchpoints or better
On Wed, Oct 5, 2016 at 1:36 PM, Jochen Theodorou wrote: > If I hear Remi saying volatile read... then it does not sound free to me > actually. In my experience volatile reads still present inlining barriers. > But if Remi and all of you tell me it is still basically free, then I will > not look too much at the volatile ;) > The volatile read is only used in the interpreter. In Groovy we use SwitchPoint as well, but only one for the whole meta class > system that could clearly be improved, it seems. Having a Switchpoint per > method is actually a very interesting approach I would not have considered > before, since it means creating a ton of Switchpoint objects. Not sure if > that works in practice for me since it is difficult to make a switchpoint > for a method that does not exist in the super class, but may come into > existence later on - still it seems I should be considering this. > I suspect Groovy developers are also less likely to modify classes at runtime? In Ruby, it's not uncommon to keep creating new classes or modifying existing ones at runtime, though it is generally discouraged (all runtimes suffer). > cold performance is a consideration for me as well though. The heavy > creation time of MethodHandles is one of the reasons we do not use > invokedynamic as much as we could... especially considering that creating a > new cache entry via runtime class generation and still invoking the method > via reflection is actually faster than producing one of our complex method > handles right now. > Creating a new cache entry via class generation? Can you elaborate on that? JRuby has a non-indy mode, but it doesn't do any code generation per call site. > As for Charles' question: > >> Can you elaborate on the structure? JRuby has 6-deep (configurable) >> polymorphic caching, with each entry being a GWT (to check type) and a SP >> (to check modification) before hitting the plumbing for the method itself. 
>> > > right now we use a 1-deep cache with several GWT (check type and argument > types) and one SP plus several transformations. My goal is of course also > the 6-deep polymorphic caching in the end. Just motivation for this was not > so high before. If I use several SwitchPoint, then of course each of them > would be there for each cache entry. How many depends on the receiver type. > But at least one for each super class (and interface) > Ahh, so when you invalidate, you only invalidate one class, but every call site would have a SwitchPoint for the target class and all of its superclasses. That will be more problematic for cold performance than JRuby's way, but less overhead when invalidating. I'm not sure which trade-off is better. We also use this invalidation mechanism when calling dynamic methods from Java (since we also use call site caches there) but those sites are not (yet) guarded by a SwitchPoint. > To my horror, I just found one piece of code commented with: > //TODO: remove this method if possible by switchpoint usage > With recent improvements to MH boot time and cold performance, I've started to use indy by default in more places, carefully measuring startup overhead along the way. I'm well on my way toward having fully invokedynamic-aware jitted code basically be all invokedynamics. > It is also good to hear that the old "once invalidated, it will not be > optimized again - ever" is no longer valid. > And hopefully it will stay that way as long as we keep making noise :-) - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: series of switchpoints or better
Hi Jochen! On Wed, Oct 5, 2016 at 7:37 AM, Jochen Theodorou wrote: > > If the meta class for A is changed, all handles operating on instances of > A may have to reselect. the handles for B and Object need not to be > affected. If the meta class for Object changes, I need to invalidate all > the handles for A, B and Object. > This is exactly how JRuby's type-modification guards work. We've used this technique since our first implementation of indy call sites. > Doing this with switchpoints means probably one switchpoint per metaclass > and a small number of meta classes per class (in total 3 in my example). > This would mean my MethodHandle would have to get through a bunch of > switchpoints, before it can do the actual method invocation. And while > switchpoints might be fast it does not sound good to me. > >From what I've seen, it's fine as far as hot performance. Adding complexity to your handle chains likely impacts cold perf, of course. Can you elaborate on the structure? JRuby has 6-deep (configurable) polymorphic caching, with each entry being a GWT (to check type) and a SP (to check modification) before hitting the plumbing for the method itself. I will say that using SwitchPoints is FAR better than our alternative mechanism: pinging the (meta)class each time and checking a serial number. > Or I can do one switchpoint for all methodhandles in the system, which > makes me wonder if after a meta class change the callsite ever gets Jitted > again. The later performance penalty is actually also not very attractive > to me. > We have fought to keep the JIT from giving up on us, and I believe that as of today you can invalidate call sites forever and the JIT will still recompile them (within memory, code cache, and other limits of course). However, you'll be invalidating every call site for every modification. If the system eventually settles, that's fine. If it doesn't, you're going to be stuck with cold call site performance most of the time. 
> So what is the way to go here? Or is there an even better way? > I strongly recommend the switchpoint-per-class granularity (or finer, like switchpoint-per-class-and-method-name, which I am playing with now). - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
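The GWT-plus-SwitchPoint cache entry described in this message can be sketched as follows. This is illustrative only (my own names and a toy fast path), not JRuby's actual plumbing:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

public class CacheEntry {
    static final MethodHandle CHECK_CLASS;
    static {
        try {
            CHECK_CLASS = MethodHandles.lookup().findStatic(CacheEntry.class, "checkClass",
                    MethodType.methodType(boolean.class, Class.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static boolean checkClass(Class<?> expected, Object receiver) {
        return receiver.getClass() == expected;
    }

    // One cache entry: a class guard (GWT) around a SwitchPoint guard; both
    // fall through to `fallback` (the next entry, or the slow path).
    static MethodHandle entry(Class<?> expected, SwitchPoint sp,
                              MethodHandle target, MethodHandle fallback) {
        MethodHandle test = CHECK_CLASS.bindTo(expected);            // (Object)boolean
        MethodHandle guarded = sp.guardWithTest(target, fallback);   // modification guard
        return MethodHandles.guardWithTest(test, guarded, fallback); // type guard
    }

    // Tiny demo: a cache entry for String receivers, "fast" vs "slow" paths.
    static Object demo(Object receiver, boolean invalidate) {
        try {
            SwitchPoint sp = new SwitchPoint();
            MethodHandle fast = MethodHandles.dropArguments(
                    MethodHandles.constant(Object.class, "fast"), 0, Object.class);
            MethodHandle slow = MethodHandles.dropArguments(
                    MethodHandles.constant(Object.class, "slow"), 0, Object.class);
            MethodHandle site = entry(String.class, sp, fast, slow);
            if (invalidate) SwitchPoint.invalidateAll(new SwitchPoint[] { sp });
            return (Object) site.invokeExact(receiver);
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }
}
```

A 6-deep cache is just six of these entries chained through their fallbacks, ending in the slow-path lookup.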
Re: ClassValue perf?
On Wed, May 6, 2015 at 6:36 PM, Jochen Theodorou wrote: > Charlie, did you ever get to writing some benchmarks? > Unfortunately not but we are getting into a performance phase over the next couple months. I'll see what I can come up with. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Getting back into indy...need a better argument collector!
Thanks Duncan. I will try to look under the covers this evening. - Charlie (mobile) On Jan 14, 2016 14:39, "MacGregor, Duncan (GE Energy Management)" < duncan.macgre...@ge.com> wrote: > On 11/01/2016, 11:27, "mlvm-dev on behalf of MacGregor, Duncan (GE Energy > Management)" duncan.macgre...@ge.com> wrote: > > >On 11/01/2016, 03:16, "mlvm-dev on behalf of Charles Oliver Nutter" > > > >wrote: > >... > >>With asCollector: 16-17s per iteration > >> > >>With hand-written array construction: 7-8s per iteration > >> > >>A sampling profile only shows my Ruby code as the top items, and an > >>allocation trace shows Object[] as the number one object being > >>created...not IRubyObject[]. Could that be the reason it's slower? > >>Some type trickery messing with optimization? > >> > >>This is very unfortunate because there's no other general-purpose way > >>to collect arguments in a handle chain. > > > >I haven’t done any comparative benchmarks in that area for a while, but > >collecting a single argument is a pretty common pattern in the Magik code, > >and I had not seen any substantial difference when we last touched that > >area. However we are collecting to plain Object[] so it might be that is > >the reason for the difference. If I’ve got time later this week I’ll do > >some experimenting and check what the current situation is. > > Okay, I’ve now had a chance to try this with our language benchmarks > and can’t see any significant difference between a hand-crafted method and > asCollector, but we are dealing with Object and Object[], so it might be > something to do with additional casting. > > Duncan. > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Getting back into indy...need a better argument collector!
Hello folks! Now that we're a few months into JRuby 9000 I've started to hack on the indy bindings again. Things are looking good so far. I'm working on getting closures to inline where they're invoked by chaining together a number of GWT just like a polymorphic call site. Anyway, my discovery today was that it's too expensive to collect a bunch of arguments at the end of an argument list right now. For one place in closure dispatch, we need to box a single incoming argument in an array. In InvokeBinder code, that looks like this: ``` Binder.from(IRubyObject[].class, IRubyObject.class).collect(0, IRubyObject[].class).identity(); ``` Since there's only a single argument and we're boxing to the end of the list, this turns into a handle.asCollector followed by an identity. Unfortunately, it's MUCH faster to just bind to a hand-written method that constructs the array directly. ``` Binder.from(IRubyObject[].class, IRubyObject.class) .invokeStaticQuiet(MethodHandles.lookup(), CompiledIRBlockBody.class, "wrapValue"); private static IRubyObject[] wrapValue(IRubyObject value) { return new IRubyObject[] {value}; } ``` I was running a benchmark that calls a method with a closure 10M times, and the method yields back to the closure 25 times, so a total of 250M closure dispatches passing through the above adapter (among others). Everything seems to fit together nicely, though I haven't checked inlining yet. With asCollector: 16-17s per iteration With hand-written array construction: 7-8s per iteration A sampling profile only shows my Ruby code as the top items, and an allocation trace shows Object[] as the number one object being created...not IRubyObject[]. Could that be the reason it's slower? Some type trickery messing with optimization? This is very unfortunate because there's no other general-purpose way to collect arguments in a handle chain. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
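For reference, the two shapes being compared in this message reduce to something like the following, using String[] in place of IRubyObject[] so the sketch runs standalone (names are mine):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class CollectDemo {
    // The hand-written collector: construct the array directly.
    static String[] wrap(String value) { return new String[] { value }; }

    // Route 1: asCollector boxes the trailing argument into a typed array.
    static String[] viaCollector(String v) {
        try {
            MethodHandle mh = MethodHandles.identity(String[].class)
                    .asCollector(String[].class, 1);
            return (String[]) mh.invokeExact(v);
        } catch (Throwable e) { throw new RuntimeException(e); }
    }

    // Route 2: bind straight to the hand-written wrapper method.
    static String[] viaHandWritten(String v) {
        try {
            MethodHandle mh = MethodHandles.lookup().findStatic(CollectDemo.class,
                    "wrap", MethodType.methodType(String[].class, String.class));
            return (String[]) mh.invokeExact(v);
        } catch (Throwable e) { throw new RuntimeException(e); }
    }
}
```

Both routes are semantically identical (a one-element typed array); the thread is about why their performance differed so much on the JDKs of the time.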
ClassValue rooting objects after it goes away?
Pardon me if this has been discussed before, but we had a bug (with fix) reported today that seems to indicate that the JVM is rooting objects put into a ClassValue even if the ClassValue goes away. Here's the pull request: https://github.com/jruby/jruby/pull/3228 And here's one example of the root trace leading back to our JRuby runtime. All the roots appear to be VM-level code: https://dl.dropboxusercontent.com/u/9213410/class-values-leak.png Is this expected? If we have to stuff a WeakReference into the ClassValue it seriously diminishes its utility to us. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
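The WeakReference workaround mentioned above looks roughly like this. It is a sketch with a stand-in DemoRuntime class, not JRuby's actual code; the recompute-on-cleared-referent dance is what makes the indirection so unappealing:

```java
import java.lang.ref.WeakReference;

public class RuntimeCache {
    static final class DemoRuntime { }   // stand-in for a heavyweight runtime object

    final DemoRuntime runtime = new DemoRuntime();

    // Store only a WeakReference so the per-class slot does not strongly
    // root the runtime even if the VM retains the ClassValue's entries.
    final ClassValue<WeakReference<DemoRuntime>> cache =
            new ClassValue<WeakReference<DemoRuntime>>() {
        @Override protected WeakReference<DemoRuntime> computeValue(Class<?> type) {
            return new WeakReference<>(runtime);
        }
    };

    DemoRuntime lookup(Class<?> type) {
        DemoRuntime r = cache.get(type).get();
        if (r == null) {              // referent was collected; recompute
            cache.remove(type);
            r = cache.get(type).get();
        }
        return r;
    }
}
```
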
Re: ClassValue perf?
On Mon, Apr 27, 2015 at 12:50 PM, Jochen Theodorou wrote: > Am 27.04.2015 19:17, schrieb Charles Oliver Nutter: >> Jochen: Is your class-to-metaclass map usable apart from the Groovy >> codebase? > > > Yes. Look for org.codehaus.groovy.reflection.GroovyClassValuePreJava7 which > is normally wrapped by a factory. Excellent, thank you! - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: ClassValue perf?
On Wed, Apr 29, 2015 at 4:02 AM, Doug Simon wrote: > We considered using ClassValue in Graal for associating each Node with its > NodeClass. Accessing the NodeClass is a very common operation in Graal (e.g., > it’s used to iterate over a Node’s inputs). However, brief experimentation > showed implementing this with ClassValue performed significantly worse than a > direct field access[1]. We currently use ClassValue to link Class values with > their Graal mirrors. Accessing this link is infrequent enough that the > performance trade off against injecting a field to java.lang.Class[2] is > acceptable. That's what I'm banking on too. My case is similar to Groovy's: I need a way to *initially* get the metaclass for a given JVM class. Unlike Groovy, however, we still have to wrap Java objects in a JRuby-aware wrapper, so subsequent accesses of the class via that object are via a plain field. So the impact of ClassValue will mostly be at the border between Ruby and Java, when we need to initially build that wrapper and put some metaclass in it. Of course the disadvantage of the wrapper is the wrapper itself. If we could inject our IRubyObject interface into java.lang.Object my life would be much better. But I digress. > The memory footprint improvement suggested in JDK-8031043 would still help. I'll have to take a look at that. We're pretty memory-sensitive since Ruby's already fairly heap-intensive. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: ClassValue perf?
It seems I may have to write some benchmarks for this then. Just so I understand, the equivalent non-ClassValue-based store would need to: * Be atomic; value may calculate more than once but only be set once. * Be weak; classes given class values must not be rooted as a result (an external impl like in JRuby or Groovy would have to use weak maps for this). Jochen: Is your class-to-metaclass map usable apart from the Groovy codebase? - Charlie On Mon, Apr 27, 2015 at 11:40 AM, Christian Thalinger wrote: > > On Apr 24, 2015, at 2:17 PM, John Rose wrote: > > On Apr 24, 2015, at 5:38 AM, Charles Oliver Nutter > wrote: > > > Hey folks! > > I'm wondering how the performance of ClassValue looks on recent > OpenJDK 7 and 8 builds. JRuby 9000 will be Java 7+ only, so this is > one place I'd like to simplify our code a bit. > > I could measure myself, but I'm guessing some of you have already done > a lot of exploration or have benchmarks handy. So, what say you? > > > I'm listening too. We don't have any special optimizations for CVs, > and I'm hoping the generic code is a good-enough start. > > > A while ago (wow; it’s more than a year already) I was working on: > > [#JDK-8031043] ClassValue's backing map should have a smaller initial size - > Java Bug System > > and we had a conversation about it: > > http://mail.openjdk.java.net/pipermail/mlvm-dev/2014-January/005597.html > > It’s not about performance directly but it’s about memory usage and maybe > the one-value-per-class optimization John suggests is in fact a performance > improvement. Someone should pick this one up. > > — John > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
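The two requirements above (install-once atomicity, weak keying) can be met without ClassValue by a synchronized WeakHashMap, roughly as below. This is a sketch, with the usual caveat that a value strongly referencing its key class would defeat the weak keys:

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

public class MetaClassMap<T> {
    // Weak keys so classes can unload; synchronized for thread safety.
    private final Map<Class<?>, T> map =
            Collections.synchronizedMap(new WeakHashMap<>());

    T get(Class<?> key, Function<Class<?>, T> compute) {
        T existing = map.get(key);
        if (existing != null) return existing;
        T candidate = compute.apply(key);            // may race and run more than once
        synchronized (map) {
            existing = map.get(key);
            if (existing != null) return existing;   // another thread installed first
            map.put(key, candidate);
            return candidate;                        // ...but only one value is kept
        }
    }
}
```

ClassValue provides the same contract without the lock on the read path, which is the perf question being asked here.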
ClassValue perf?
Hey folks! I'm wondering how the performance of ClassValue looks on recent OpenJDK 7 and 8 builds. JRuby 9000 will be Java 7+ only, so this is one place I'd like to simplify our code a bit. I could measure myself, but I'm guessing some of you have already done a lot of exploration or have benchmarks handy. So, what say you? - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: IntelliJ debugger?
I have never gotten either Netbeans' or IntelliJ's debuggers to step through source without a .java extension (speaking specifically of Ruby code, even if source dirs and JSR-45 stuff are in place). I can easily get jdb to step through any language's source, so I know it's not a problem with how I'm emitting debug info. - Charlie On Fri, Apr 3, 2015 at 1:51 AM, Dain Sundstrom wrote: > So I did a bunch more testing, and this is what I found: > > - The IntelliJ debugger ignores the “source” declaration in the class file > and instead always looks for a “.java” file in the source path > - The file must contain a java class declaration with the same name > - The file must be “recognized” by IntelliJ before the debugger stops, so you > can’t dynamically generate a bogus java file > - If the file is not present, the debugger will not show local variables > - The debugger seems to ignore local variable type declarations, so the > “Evaluate Expressions” window does not get type ahead (but works otherwise). > > I might try adding a JSR-45 SMAP, but I don’t have high hopes based on > Charlie’s comments at the last summit. > > Does anyone else have any ideas on things that might work? > > -dain > > On Apr 1, 2015, at 11:08 PM, Dain Sundstrom wrote: > >> Hi all, >> >> I think this might have been asked before... Has anyone gotten the intelliJ >> debugger to step through the source file for their language? >> >> Adding the source and line numbers during generation makes stack traces >> come out correctly, and Intellij even opens the correct file location. >> During debugging, I can see the correct source and line numbers, but >> intellij doesn’t open the file. >> >> I’d even be ok with a hacky solution where I rename all of my files to be >> “x.java”. 
>> >> Thanks, >> >> -dain > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Lost perf between 8u40 and 9 hs-comp
Ok, now we're cracking! Performance has definitely returned, and actually improved 15-20% beyond my current copy of 8u40. Bravo! I will try testing several other benchmarks, and perhaps set up a machine to do the big perf regression suite the JRuby+Truffle guys made for us. FWIW, the additional "Per" flags did not appear to help performance, and actually seemed to degrade it almost back to where 8u40 lies. - Charlie On Fri, Mar 6, 2015 at 7:06 AM, Vladimir Ivanov wrote: > John, > > You are absolutely right. I should've spent more time exploring the code > than writing emails :-) > > Here's the fix: > http://cr.openjdk.java.net/~vlivanov/8074548/webrev.00/ > > Charlie, I'd love to hear your feedback on it. It fixes the regression on > bench_red_black.rb for me. > > Also, please, try -XX:PerBytecodeRecompilationCutoff=-1 > -XX:PerMethodRecompilationCutoff=-1 (to work around another problem I spotted > [1]). > > On 3/4/15 5:16 AM, John Rose wrote: >> >> On Mar 3, 2015, at 3:21 PM, Vladimir Ivanov >> wrote: >>> >>> >>> Ah, I see now. >>> >>> You suggest conditionally inserting an uncommon trap in MHI.profileBoolean >>> when a count == 0, right? >>> >>> Won't we end up with 2 checks if the VM can't fold them (e.g. some action in >>> between)? >> >> >> Maybe; that's the weak point of the idea. The VM *does* fold many >> dominating ifs, as you know. >> >> But, if the profileBoolean really traps on one branch, then it can return >> a *constant* value, can't it? >> >> After that, the cmps and ifs will fold up. > > Brilliant idea! I think the JIT can find that out itself, but additional help > is always useful. > > The real weak point IMO is that we need to keep the MHI.profileBoolean intrinsic > and the never-taken branch pruning logic during parsing (in parse2.cpp) in > sync. Otherwise, if the VM starts to prune rarely taken branches at some > point, we can end up in the same situation. 
> > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8074551 > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: What can we improve in JSR292 for Java 9?
On Thu, Feb 26, 2015 at 4:27 AM, Jochen Theodorou wrote: > my biggest request: allow the call of a super constructor (like > super(foo,bar)) using MethodHandles and have it understood by the JVM like a > normal super constructor call... same for this(...) Just so I understand...the problem is that unless you can get a Lookup that can do the super call from Java (i.e. from within a subclass), you can't get a handle that can do the super call, right? And you can't do that because the method bodies might not be emitted into a natural subclass of the super class? - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
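[Editor's note: the Lookup restriction discussed above can be seen directly with findSpecial for super *method* calls: it only succeeds when the lookup was created inside the subclass, because the specialCaller must equal the lookup class and the lookup must have private access. For `super(...)` constructor calls there is no MethodHandle equivalent at all, which is Jochen's complaint. A sketch of the method case, with hypothetical Base/Sub classes:]

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class SuperCallDemo {
    static class Base {
        String greet() { return "base"; }
    }
    static class Sub extends Base {
        @Override String greet() { return "sub"; }
        // this lookup is created inside Sub, so specialCaller == lookup class
        // and the resulting handle gets invokespecial (super-call) semantics
        static final MethodHandle SUPER_GREET;
        static {
            try {
                SUPER_GREET = MethodHandles.lookup().findSpecial(
                    Base.class, "greet",
                    MethodType.methodType(String.class), Sub.class);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }
    }
    public static void main(String[] args) throws Throwable {
        Sub s = new Sub();
        System.out.println(s.greet());                          // virtual dispatch
        System.out.println((String) Sub.SUPER_GREET.invoke(s)); // super dispatch
    }
}
```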
Re: What can we improve in JSR292 for Java 9?
Busy week, finally circling back to this thread... On Wed, Feb 25, 2015 at 8:29 PM, John Rose wrote: >> * A loop handle :-) >> >> Given a body and a test, run the body until the test is false. I'm >> guessing there's a good reason we don't have this already. > > A few reasons: 1. You can code your own easily. I can't code one that will specialize for every call path, though, unless I generate a new loop body for every call path. > 2. There's no One True Loop the way there is a One True If. > The "run until test is false" model assumes all the real work is > done with side-effects, which are off-center from the MH model. This I can appreciate. My mental model of MHs started to trend toward a general-purpose IR, and I believe if I had some sort of backward branch it could be that. But I understand if that's the wrong conceptual model, and I realize now that MHs are basically call stack adapters with a bit of forward branching thrown in. It does feel like there's a need for better representation of branch-joining or phi or whatever you want to call it, though. > 3. A really clean looping mechanism probably needs a sprinkle > of tail call optimization. > > I'm not saying that loops should never have side effects, but I > am saying that a loop mechanism should not mandate them. > > Maybe this is general enough: > > MHs.loop(init, predicate, body)(*a) > => { let i = init(*a); while (predicate(i, *a)) { i = body(i, *a); } > return i; } > > ...where the type of i depends on init, and if init returns void then you > have a classic side-effect-only loop. Ahh yes, this makes sense. If it were unrolled, it would simply be a series of folds and drops as each iteration through the body modified the condition in some way. So then we just need it to work without unrolling. 
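[Editor's note: a combinator with almost exactly the shape John sketches above later shipped in Java 9 as MethodHandles.whileLoop(init, pred, body). A hedged sketch of the `let i = init(*a); while (predicate(i, *a)) i = body(i, *a); return i;` semantics, assuming Java 9+, computing the smallest power of two >= n:]

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import static java.lang.invoke.MethodHandles.*;
import static java.lang.invoke.MethodType.methodType;

public class LoopDemo {
    static boolean below(int v, int n) { return v < n; }
    static int dbl(int v, int n) { return v * 2; }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup l = lookup();
        // init: (int n) -> 1, ignoring the external argument
        MethodHandle init = dropArguments(constant(int.class, 1), 0, int.class);
        // pred: (int v, int n) -> boolean; body: (int v, int n) -> int
        MethodHandle pred = l.findStatic(LoopDemo.class, "below",
            methodType(boolean.class, int.class, int.class));
        MethodHandle body = l.findStatic(LoopDemo.class, "dbl",
            methodType(int.class, int.class, int.class));
        // i = init(n); while (pred(i, n)) i = body(i, n); return i;
        MethodHandle smallestPow2 = whileLoop(init, pred, body); // (int) -> int
        System.out.println((int) smallestPow2.invokeExact(20));
    }
}
```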
My silly use case for this would be to emit simple expressions entirely as a MH chain, so we'd get the benefit of MH optimizations without generating our own bytecode (and with forced inlining and perhaps a richer semantic representation than just bytecode). It's not a very compelling case, of course, since I could just emit bytecode too. >> * try/finally as a core atom of MethodHandles API. >> >> Libraries like invokebinder provide a shortcut API for generating the >> large tree of handles needed for try/finally, but the JVM may not be >> able to optimize that tree as well as a purpose-built adapter. > > I agree there. We should put this in. > >MHs.tryFinally(target, cleanup)(*a) > => { try { return target(*a); } finally { cleanup(*a); } } > > (Even here there are non-universalities; what if the cleanup > wants to see the return value and/or the thrown exception? > Should it take those as one or two leading arguments?) In InvokeBinder, the finally is expected to require no additional arguments compared to the try body, since that was the use case I needed. You bring up a good point...and perhaps the built-in JSR292 tryFinally should take *two* handles: one for the exceptional path (with exception in hand) and one for the non-exceptional path (with return value in hand)? The exceptional path would be expected to return the same type as the try body or re-raise the exception. The non-exceptional path would be expected to return void. > We now have MHs.collectArguments. Do you want MHs.spreadArguments > to reverse the effect? Or is there something else I'm missing? I want to be able to group arguments at any position in the argument list. Example: Incoming signature: (String foo, int a1, int a2, int a3, Object obj) Target signature: (String foo, int[] as, Object obj) ... without permuting arguments > Idea of the day: An ASM-like library for method handles. > Make a MethodHandleReader which can run a visitor over the MH. 
> The ops of the visitor would be a selection of public MH operations > like filter, collect, spread, lookup, etc. > Also ASM-like, the library would have a MethodHandleWriter > which could be hooked up with the reader to make filters. That would certainly cover it! I'd expect this to add: * MethodHandleVisitor interface of some kind * MethodHandle#accept(MethodHandleVisitor) or similar * MethodHandleType enum with all the base MH types (so we're not forcing all types into a static interface). - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
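[Editor's note: John's collectArguments suggestion does in fact cover the middle-of-the-list grouping example in this exchange, with no permutes: replace the middle parameter with a collector that boxes the three ints. A sketch; the target method here is a hypothetical stand-in.]

```java
import java.lang.invoke.MethodHandle;
import static java.lang.invoke.MethodHandles.*;
import static java.lang.invoke.MethodType.methodType;

public class GroupDemo {
    // hypothetical target with signature (String, int[], Object) -> String
    static String target(String foo, int[] as, Object obj) {
        return foo + ":" + (as[0] + as[1] + as[2]) + ":" + obj;
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle target = lookup().findStatic(GroupDemo.class, "target",
            methodType(String.class, String.class, int[].class, Object.class));
        // (int, int, int) -> int[] : boxes three ints into an array
        MethodHandle collect = identity(int[].class).asCollector(int[].class, 3);
        // replace parameter 1 in place; surrounding args stay put, no permutes
        MethodHandle grouped = collectArguments(target, 1, collect);
        // grouped: (String, int, int, int, Object) -> String
        System.out.println((String) grouped.invokeExact("foo", 1, 2, 3, (Object) "x"));
    }
}
```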
Re: Lost perf between 8u40 and 9 hs-comp
Thanks for looking into it Vladimir...I'm standing by to test out anything! - Charlie On Tue, Mar 3, 2015 at 10:23 AM, Vladimir Ivanov wrote: > John, > >> So let's make hindsight work for us: Is there a way (either with or >> without the split you suggest) to more firmly couple the update to the >> query? Separating into two operations might be the cleanest way to go, but >> I think it's safer to keep both halves together, as long as the slow path >> can do the right stuff. >> >> Suggestion: Instead of having the intrinsic expand to nothing, have it >> expand to an uncommon trap (on the slow path), with the uncommon trap doing >> the profile update operation (as currently coded). > > Right now, the VM doesn't care about profiling logic at all. The intrinsic is > used only to inject profile data and all profiling happens in Java code. > Once MHI.profileBoolean is intrinsified (profile is injected), no profiling > actions are performed. > > The only way I see is to inject a count bump on the pruned branch before issuing > the uncommon trap, akin to profile_taken_branch in Parse::do_if, but updating a > user-specified int[2] rather than the MDO. > > It looks irregular and spreads profiling logic between VM & Java code. But > it allows us to keep a single entry point between VM & Java (MHI.profileBoolean). > > I'll prototype it to see how it looks at the code level. > > Best regards, > Vladimir Ivanov > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
What can we improve in JSR292 for Java 9?
After talking with folks at the Jfokus VM Summit, it seems like there are a number of nice-to-have and a few need-to-have features we'd like to see get into java.lang.invoke. Vladimir suggested I start a thread on these features. A few from me: * A loop handle :-) Given a body and a test, run the body until the test is false. I'm guessing there's a good reason we don't have this already. * try/finally as a core atom of the MethodHandles API. Libraries like invokebinder provide a shortcut API for generating the large tree of handles needed for try/finally, but the JVM may not be able to optimize that tree as well as a purpose-built adapter. * Argument grouping operations in the middle of the argument list. JRuby has many signatures that vararg somewhere other than the end of the argument list, and the juggling required to do that logic in handles is complex: shift to-be-boxed args to end, box them, shift box back. Another point about these more complicated forms: they're ESPECIALLY slow early in execution, before LFs have been compiled to bytecode. * Implementation-specific inspection API. I know there are different ways to express a MH tree on different JVMs (e.g. J9) but it would still be a big help for me if there were a good way to get some debug-time structural information about a handle I'm using. Hidden API would be ok if it's not too hidden :-) That's off the top of my head. Others? - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
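[Editor's note: for reference, the "large tree of handles" for try/finally looks roughly like this when built by hand from catchException, foldArguments, and throwException. A hedged sketch for a one-argument target whose argument and return types match; libraries like InvokeBinder generalize this, and the body/cleanup methods here are hypothetical.]

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import static java.lang.invoke.MethodHandles.*;
import static java.lang.invoke.MethodType.methodType;

public class TryFinallyDemo {
    static final StringBuilder LOG = new StringBuilder();
    static String body(String s) {
        LOG.append("body;");
        if (s.isEmpty()) throw new IllegalArgumentException();
        return s.toUpperCase();
    }
    static void cleanup(String s) { LOG.append("cleanup;"); }

    // emulate: try { return target(s); } finally { cleanup(s); }
    static MethodHandle tryFinally(MethodHandle target, MethodHandle cleanup) {
        Class<?> ret = target.type().returnType();
        Class<?> arg = target.type().parameterType(0);
        // exceptional path (Throwable, arg) -> ret: run cleanup(s), then rethrow
        MethodHandle rethrow = dropArguments(throwException(ret, Throwable.class), 1, arg);
        MethodHandle handler = foldArguments(rethrow, dropArguments(cleanup, 0, Throwable.class));
        // normal path (arg) -> ret: v = target(s); cleanup(s); return v
        MethodHandle returnV = dropArguments(identity(ret), 1, arg);
        MethodHandle after   = foldArguments(returnV, dropArguments(cleanup, 0, ret));
        MethodHandle normal  = foldArguments(after, target);
        return catchException(normal, Throwable.class, handler);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup l = lookup();
        MethodHandle t = tryFinally(
            l.findStatic(TryFinallyDemo.class, "body", methodType(String.class, String.class)),
            l.findStatic(TryFinallyDemo.class, "cleanup", methodType(void.class, String.class)));
        System.out.println((String) t.invokeExact("ok"));
        try {
            String ignored = (String) t.invokeExact("");
        } catch (IllegalArgumentException expected) { }
        System.out.println(LOG);  // cleanup ran on both paths
    }
}
```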
Lost perf between 8u40 and 9 hs-comp
I'm finally at home with a working machine so I can follow up on some VM Summit to-dos. Vladimir wanted me to test out jdk9 hs-comp, which has all his latest work on method handles. I wish I could report that performance looks great, but it doesn't. Here's timing (in s) of our red/black benchmark on JRuby 1.7.19, first on the latest (as of today) 8u40 snapshot build and then on a minutes-old jdk9 hs-comp build: ~/projects/jruby $ (pickjdk 4 ; rvm jruby-1.7.19 do ruby -Xcompile.invokedynamic=true ../rubybench/time/bench_red_black.rb 10) New JDK: jdk1.8.0_40.jdk 5.206 2.497 0.69 0.703 0.72 0.645 0.698 0.673 0.685 0.67 ~/projects/jruby $ (pickjdk 5 ; rvm jruby-1.7.19 do ruby -Xcompile.invokedynamic=true ../rubybench/time/bench_red_black.rb 10) New JDK: jdk1.9_hs-comp 5.048 3.773 1.836 1.474 1.366 1.394 1.249 1.399 1.352 1.346 Perf is just about 2x slower on jdk9 hs-comp. I tried out a few other benchmarks, which don't seem to have as much variation: * recursive fib(35): equal perf * mandelbrot: jdk8u40 5% faster * protobuf: jdk9 5% faster The benchmarks are in jruby/rubybench on Github. JRuby 1.7.19 can be grabbed from jruby.org or built from jruby/jruby (see BUILDING.md). Looking forward to helping improve this :-) - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Invokedynamic and recursive method call
This could explain regressions we've seen in the performance of heavily-recursive algorithms. I'll try to get an assembly dump for fib in JRuby later today. - Charlie On Wed, Jan 7, 2015 at 10:13 AM, Remi Forax wrote: > > On 01/07/2015 10:43 AM, Marcus Lagergren wrote: >> >> Remi, I tried to reproduce your problem with jdk9 b44. It runs decently >> fast. > > > yes, nashorn is fast enough but it can be faster if the JIT were not doing > something stupid. > > When the VM inlines fibo, because fibo is recursive, the recursive call is > inlined only once, > so the call at depth=2 cannot be inlined but should be a classical direct > call. > > But if fibo is called through an invokedynamic, instead of emitting a direct > call to fibo, > the JIT generates code that pushes the method handle on the stack and executes it > as if the method handle were not constant > (the method handle is constant because the call at depth=1 is inlined !). > >> When did it start to regress? > > > jdk7u40, i believe. > > I've created a jar containing some handwritten bytecodes with no dependencies > to reproduce the issue easily: > https://github.com/forax/vmboiler/blob/master/test7/fibo7.jar > > [forax@localhost test7]$ time /usr/jdk/jdk1.9.0/bin/java -cp fibo7.jar > FiboSample > 1836311903 > > real 0m6.653s > user 0m6.729s > sys 0m0.019s > [forax@localhost test7]$ time /usr/jdk/jdk1.8.0_25/bin/java -cp fibo7.jar > FiboSample > 1836311903 > > real 0m6.572s > user 0m6.591s > sys 0m0.019s > [forax@localhost test7]$ time /usr/jdk/jdk1.7.0_71/bin/java -cp fibo7.jar > FiboSample > 1836311903 > > real 0m6.373s > user 0m6.396s > sys 0m0.016s > [forax@localhost test7]$ time /usr/jdk/jdk1.7.0_25/bin/java -cp fibo7.jar > FiboSample > 1836311903 > > real 0m4.847s > user 0m4.832s > sys 0m0.019s > > as you can see, it was faster with a JDK before jdk7u40. 
> >> >> Regards >> Marcus > > > cheers, > Rémi > > >> >>> On 30 Dec 2014, at 20:48, Remi Forax wrote: >>> >>> Hi guys, >>> I've found a bug in the interaction between the lambda form and inlining >>> algorithm, >>> basically if the inlining heuristic bails out because the method is >>> recursive and already inlined once, >>> instead of emitting code to do a direct call, it reverts to a call to >>> linkStatic with the method >>> as MemberName. >>> >>> I think it's a regression because before the introduction of lambda >>> forms, >>> I'm pretty sure that the JIT was emitting a direct call. >>> >>> Steps to reproduce with nashorn: run this JavaScript code >>> function fibo(n) { >>> return (n < 2)? 1: fibo(n - 1) + fibo(n - 2) >>> } >>> >>> print(fibo(45)) >>> >>> like this: >>> /usr/jdk/jdk1.9.0/bin/jjs -J-XX:+UnlockDiagnosticVMOptions >>> -J-XX:+PrintAssembly fibo.js > log.txt >>> >>> look for a method 'fibo' from the tail of the log, you will find >>> something like this: >>> >>> 0x7f97e4b4743f: mov $0x76d08f770,%r8 ; {oop(a >>> 'java/lang/invoke/MemberName' = {method} {0x7f97dcff8e40} 'fibo' >>> '(Ljdk/nashorn/internal/runtime/ScriptFunction;Ljava/lang/Object;I)I' in >>> 'jdk/nashorn/internal/scripts/Script$Recompilation$2$fibo')} >>> 0x7f97e4b47449: xchg %ax,%ax >>> 0x7f97e4b4744b: callq 0x7f97dd0446e0 >>> >>> I hope this can be fixed. My demonstration that I can have fibo written >>> with a dynamic language >>> that runs as fast as written in Java doesn't work anymore :( >>> >>> cheers, >>> Rémi >>> >>> ___ >>> mlvm-dev mailing list >>> mlvm-dev@openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Truffle and mlvm
world” as close as possible. > > There is absolutely no reason to believe that a Truffle-based Ruby > > implementation would not have benefits for “real-world applications”. Or > > that it would not be able to run a large application for a long time. It > is > > clear that the TruffleRuby prototype needs more completeness work both at > > the language and the library level. We are very happy with the results we > > got so far with Chris working for about a year. We are planning to > increase > > the number of people working on this, and would also be grateful for any > > help we can get from the Ruby community. > > > > Regarding Graal: Did you ever try to benchmark JRuby without Truffle > with > > the latest Graal binaries available at > > http://lafo.ssw.uni-linz.ac.at/builds/? We would be looking forward to > > see the peak performance results on a couple of workloads. We are not > > speculating about Graal becoming part of a particular OpenJDK release (as > > experimental or regular option). This is the sovereign decision of the > > OpenJDK community. All we can do is to demonstrate and inform about > Graal’s > > performance and stability. > > > > We recognise that there is a long road ahead. But in particular in this > > context, I would like to emphasize that we are looking for more people to > > support this effort for a new language implementation platform. I > strongly > > believe that Truffle is the best currently available vehicle to make Ruby > > competitive in terms of performance with node.js. We are happy to try to > > *prove* you wrong - even happier about support of any kind along the road > > ;). I am also looking forward to continue this discussion at JavaOne (as > > part of the TruffleRuby session or elsewhere). > > > > Regards, thomas > > > > On 30 Aug 2014, at 21:21, Charles Oliver Nutter > > wrote: > > > > > Removing all context, so it's clear this is just my opinions and > > thoughts... 
> > > > > > As most of you know, we've opened up our codebase and incorporated the > > > graciously-donated RubyTruffle directly into JRuby. It's available on > > > JRuby master and we are planning to ship Truffle support with JRuby > > > 9000, our next major version (due out in the next couple months). > > > > > > At the same time, we have been developing our own next-gen IR-based > > > compiler, which will run unmodified on any JVM (with or without > > > invokedynamic, though I still have to implement the "without" side). > > > Why are we doing this when Truffle shows such promise? > > > > > > I'll try to enumerate the benefits and problems of Truffle here. > > > > > > * Benefits of using Truffle > > > > > > 1. Simpler implementation. > > > > > > From day 1, the most obvious benefit of Truffle is that you just have > > > to write an AST interpreter. Anyone who has implemented a programming > > > language can do this easily. This specific benefit doesn't help us > > > implement JRuby, since we already have an AST interpreter, but it did > > > make Chris Seaton's job easier building RubyTruffle initially. This > > > also means a Truffle-based language is more approachable than one with > > > a complicated compiler pipeline of its own. > > > > > > 2. Better communication with the JIT. > > > > > > Truffle, via Graal, has potential to pass much more information on to > > > the JIT. Things like type shape, escaped references, frame access, > > > type specialization, and so on can be communicated directly, rather > > > than hoping and praying they'll be inferred by the shape of bytecodes. > > > This is probably the largest benefit; much of my time optimizing JRuby > > > has been spend trying to "trick" C2 into doing the right thing, since > > > I don't have a direct way to communicate intent. > > > > > > The peak performance numbers for Truffle-based languages have been > > > extremely impressive. 
If it's possible to get those numbers reasonably > > > quickly and with predictable steady-state behavior in large, > > > heterogeneous codebases, this is definitely the quickest path (on any > > > runtime) to a high-performance language implementation. > > > > > > 3. OSS and pure Java > > > > > > Truffle and Graal are just OpenJDK projects under OpenJDK licenses, > > > and anyone can build, hack, or distribute them. In addition, both >
Re: The Great Startup Problem
On Mon, Sep 1, 2014 at 2:07 AM, Vladimir Ivanov wrote: > Stack usage won't be constant though. Each compiled LF being executed > consumes 1 stack frame, so for a method handle chain of N elements, its > invocation consumes ~N stack frames. > > Is it acceptable, and does it solve the problem for you? This is acceptable for JRuby. Our worst-case Ruby method handle chain will include at most: * Two CatchExceptions for pre/post logic (heap frames, etc). Perf of CatchException compared to literal Java try/catch is important here. * Up to two permute arguments for differing call site/target argument ordering. * Varargs negotiation (may be a couple handles) * GWT * SwitchPoint * For Ruby to Java calls, each argument plus the return value must be filtered to convert to/from Ruby types or apply an IRubyObject wrapper This is worst case, mind you. Most calls in the system will be arity-matched, eliminating the permutes. Most calls will be three or fewer arguments, eliminating varargs. Many calls will be optimized to no longer need a heap frame, eliminating the try/finally. The absolute minimum for any call would be SwitchPoint plus GWT. Of course I'm not counting DMHs here, since they're either the call we want to make or they're leaf logic. > We discussed an idea to generate custom bytecodes (single method) for the > whole method handle chain (and have only 1 extra stack frame per MH > invocation), but it defeats the memory footprint reduction we are trying to > achieve with LambdaForm sharing. Funny thing...because indy slows our startup and increases our warmup time, we're using our old binding logic by default. And surprise surprise, our old binding logic does exactly this...one small generated invoker class per method. I'm sure you're right that this approach defeats the sharing and memory reduction we'd like to see from LFs, but it works *really* well if you're ok with the extra class and metaspace data in memory. 
So there's one question: is the cost of a bytecoded adapter shim for each method object really that high? Yes, if you're spinning new MHs constantly or doing a million different adaptations of a given method. But if you're just lazily creating an invoker shim once per method, that really doesn't seem like a big deal. My indy binding logic also has a dozen different flags for tweaking. I can easily modify it to avoid doing all that pre/post logic and argument permutation in the MH chain and just bind directly to the generated invoker. Best (or worst) of both worlds? I just really don't want to have to do that...I want everything from call site to target method body to be in the MH chain. For JRuby 9000, all try/finally logic will be within the target method, so at least that part of the MH chain goes away. Here's another idea... We've been using my InvokeBinder library heavily in JRuby. It provides a Java API/DSL for creating MH chains lazily from the top down: MethodHandle mh = Binder.from(String.class, Object.class, Float.class) .tryFinally(finallyLogic) .permute(1, 0) .append("Hello") .drop(1) .invokeStatic(MyClass.class, "someMethod"); The adaptations are gathered within the Binder instance, playing forward as you add adaptations and played backward at binding time to make the appropriate MethodHandles and MethodHandle calls. Duncan talked about how he was able to improve MH chain size and performance by applying certain transformations in a different order, among other things. InvokeBinder *could* be doing a lot more to optimize the MH chain. For example, the above case never uses the Object value passed in (it is permuted to position 1 and later dropped), but that fact is obscured by the intervening append. InvokeBinder is basically doing with MHs what MHs do with LFs. Perhaps what we really need is a more holistic view of MH + LF operations *together* so we can boil the whole thing down (even across MH lines) before we start interpreting or compiling it? 
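[Editor's note: the "absolute minimum" call path mentioned above (SwitchPoint plus GWT) composes like this. A hedged sketch; the class test, cached target, and fallback are hypothetical stand-ins for real call-site logic.]

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.SwitchPoint;
import static java.lang.invoke.MethodHandles.*;
import static java.lang.invoke.MethodType.methodType;

public class MinimalSiteDemo {
    static boolean sameClass(Class<?> expected, Object self) {
        return self.getClass() == expected;
    }
    static String cached(Object self)   { return "cached"; }
    static String fallback(Object self) { return "fallback"; }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup l = lookup();
        MethodHandle test = insertArguments(
            l.findStatic(MinimalSiteDemo.class, "sameClass",
                methodType(boolean.class, Class.class, Object.class)),
            0, String.class);
        MethodHandle target = l.findStatic(MinimalSiteDemo.class, "cached",
            methodType(String.class, Object.class));
        MethodHandle fallback = l.findStatic(MinimalSiteDemo.class, "fallback",
            methodType(String.class, Object.class));
        // GWT guards the monomorphic cache; the SwitchPoint guards global
        // invalidation (e.g. method table changes)
        SwitchPoint sp = new SwitchPoint();
        MethodHandle site = sp.guardWithTest(guardWithTest(test, target, fallback), fallback);
        System.out.println((String) site.invokeExact((Object) "str")); // guard passes
        System.out.println((String) site.invokeExact((Object) 42));    // guard fails
        SwitchPoint.invalidateAll(new SwitchPoint[] { sp });
        System.out.println((String) site.invokeExact((Object) "str")); // invalidated
    }
}
```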
- Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Class hierarchy analysis and CallSites
On Mon, Sep 1, 2014 at 8:46 AM, MacGregor, Duncan (GE Energy Management) wrote: > Has anybody else tried doing this sort of thing as part of their > invokeDynamic Implementation? I’m curious if anybody has data comparing the > speed of GWT & class comparison based PICs with checks that require getting a > ClassValue and doing a Map or Set lookup? I've thought about trying this. Here's the short version of JRuby's class hierarchy + lookup mechanism: * Each class has a map of methods and a lookup cache. The method map only contains methods from that class, but the lookup cache (which all lookups pass through) may eventually hold all methods from superclasses as well. This is class-level method caching. * Call sites look up the method on the receiver's natural class, which will populate an entry in the class-level lookup cache if none is there. * The lookup cache entries are a tuple of class serial number + method object. The serial number represents the class's version at lookup time. * When any method table changes, the serial numbers of all child classes get bumped, so their already-cached lookup entries are now invalid. Upon next lookup, a stale entry will be replaced. On the call site guard side, we have used both serial number comparison and class reference comparison. Using the serial number has a couple of advantages: no hard references to classes in the call site, classes that are identical can be made to look identical by reusing a serial number, etc. The disadvantage is that it requires an additional dereference to get the current serial number off incoming classes every time. For class comparison, we dereference the metaclass field on the object to get a RubyClass object reference, store that at the call site, and do direct referential comparisons in the call site guard. I should note that most objects in JRuby are of the same JVM type. We generally don't stand up a JVM class for every Ruby class, since Ruby classes are sometimes (frequently?) 
created and thrown away as part of normal execution. This could be considered a sort of CHA, since the serial number indicates not just the version of the class, but the version of all its ancestors. It could be improved, however, to be a calculated value based on the actual shape of the class, so two different classes with the same superclass and no methods of their own would look the same to the guard. I have only done basic experiments here. - Charlie > > Duncan. > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
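[Editor's note: the serial-number scheme described above can be sketched like this. All names are hypothetical and JRuby's real implementation differs; the point is that a cache entry is valid only while the class's serial matches the serial recorded at lookup time, and any method-table change invalidates all descendants.]

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class SerialCacheDemo {
    static final AtomicLong SERIALS = new AtomicLong();

    static final class Entry {
        final long serial;    // class version at lookup time
        final String method;  // stand-in for a real method object
        Entry(long serial, String method) { this.serial = serial; this.method = method; }
    }

    static final class MetaClass {
        final MetaClass superClass;
        final List<MetaClass> children = new ArrayList<>();
        final Map<String, String> methods = new HashMap<>();     // own methods only
        final Map<String, Entry> lookupCache = new HashMap<>();  // flattened cache
        long serial = SERIALS.incrementAndGet();

        MetaClass(MetaClass superClass) {
            this.superClass = superClass;
            if (superClass != null) superClass.children.add(this);
        }
        void define(String name, String impl) {
            methods.put(name, impl);
            bump();
        }
        private void bump() {  // invalidate this class and all descendants
            serial = SERIALS.incrementAndGet();
            for (MetaClass c : children) c.bump();
        }
        String lookup(String name) {
            Entry e = lookupCache.get(name);
            if (e != null && e.serial == serial) return e.method;   // fresh hit
            for (MetaClass k = this; k != null; k = k.superClass) { // slow path
                String m = k.methods.get(name);
                if (m != null) {
                    lookupCache.put(name, new Entry(serial, m));
                    return m;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        MetaClass parent = new MetaClass(null);
        MetaClass child = new MetaClass(parent);
        parent.define("foo", "parent.foo");
        System.out.println(child.lookup("foo"));           // inherited, now cached
        long cachedSerial = child.serial;
        child.define("foo", "child.foo");                   // bumps child's serial
        System.out.println(cachedSerial != child.serial);   // a guard would now fail
        System.out.println(child.lookup("foo"));           // stale entry replaced
    }
}
```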
Re: Truffle and mlvm
Removing all context, so it's clear these are just my opinions and thoughts... As most of you know, we've opened up our codebase and incorporated the graciously-donated RubyTruffle directly into JRuby. It's available on JRuby master and we are planning to ship Truffle support with JRuby 9000, our next major version (due out in the next couple months). At the same time, we have been developing our own next-gen IR-based compiler, which will run unmodified on any JVM (with or without invokedynamic, though I still have to implement the "without" side). Why are we doing this when Truffle shows such promise? I'll try to enumerate the benefits and problems of Truffle here. * Benefits of using Truffle 1. Simpler implementation. From day 1, the most obvious benefit of Truffle is that you just have to write an AST interpreter. Anyone who has implemented a programming language can do this easily. This specific benefit doesn't help us implement JRuby, since we already have an AST interpreter, but it did make Chris Seaton's job easier building RubyTruffle initially. This also means a Truffle-based language is more approachable than one with a complicated compiler pipeline of its own. 2. Better communication with the JIT. Truffle, via Graal, has potential to pass much more information on to the JIT. Things like type shape, escaped references, frame access, type specialization, and so on can be communicated directly, rather than hoping and praying they'll be inferred by the shape of bytecodes. This is probably the largest benefit; much of my time optimizing JRuby has been spent trying to "trick" C2 into doing the right thing, since I don't have a direct way to communicate intent. The peak performance numbers for Truffle-based languages have been extremely impressive. 
If it's possible to get those numbers reasonably quickly and with predictable steady-state behavior in large, heterogeneous codebases, this is definitely the quickest path (on any runtime) to a high-performance language implementation. 3. OSS and pure Java Truffle and Graal are just OpenJDK projects under OpenJDK licenses, and anyone can build, hack, or distribute them. In addition, both Truffle and Graal are 100% Java, so for the first time a plain old Java developer can see (and manipulate) exactly how the JIT works without getting lost in a sea of plus plus. * Problems with Truffle I want to emphasize that regardless of its warts, we love Truffle and Graal and we see great potential here. But we need a dose of reality once in a while, too. 1. AST is not enough. In order to make that AST fly, you can't just implement a dumb generic interpreter. You need to know about (and generously annotate your AST for) many advanced compiler optimization techniques: A. Type specialization plus guarded fallbacks: Truffle will NOT specialize your code for you. You must provide every specialized path in your AST nodes as well as annotating "slow path", "transfer to interpreter", etc. B. Frame access and reification: In order to have cross-call access to frames or to squash frames created for multiple inlined calls, you must use Truffle's representation of a frame. This means loads/stores within your AST must be done against a Truffle object, not against an arbitrary object of your own creation. C. Method invocation and inlining: Up until fairly recently, if you wanted to inline methods you had to essentially build your own call site logic, profiling, deopt paths within your Truffle AST. When I did a little hacking on RubyTruffle around OSS time (December/January) it did *no* inlining of Ruby-to-Ruby calls. I hacked in inlining using existing classes and managed to get it to work, but I was doing all the plumbing myself. 
I know this has improved in the Truffle codebase since then, but I have my concerns about production readiness when the inlining call site parts of Truffle were just recently added and are still in flux.

And there are plenty of other cases. Building a basic language for Truffle is pretty easy (I did a micro-language in about two hours at JVMLS last year), but building a high-performance language for Truffle still takes a fair investment of effort and working knowledge of dynamic compiler optimizations.

2. Long startup and warmup times.

As Thomas pointed out in the other thread, because Truffle and Graal are normally run as plain Java libraries, they can actually aggravate startup time issues. Now, not only does all of JRuby have to warm up, but the eventual native-code JIT has to warm up too. This is not surprising, really. It is possible to mitigate this by doing some form of AOT against Graal, but in every case I have seen, the Truffle/Graal approach makes startup time much, much worse compared to just running atop the JVM.

Warmup time is also worsened significantly. The AST you create for Truffle must be heavily mutated while running in order to produce a specialized version of that AST. This must happen before the AST is eventually fed
Re: Defining anonymous classes
On Fri, Aug 15, 2014 at 5:39 PM, John Rose wrote:
> If the host-class token were changed to a MethodHandles.Lookup object, we
> could restrict the host-class to be one which the user already had
> appropriate access to. Seems simple, but of course the rest of the project
> is complicated: API design, spec completion, security analysis, positive
> and negative test creation, code development, quality assurance—all these
> would be expensive, and (again) most easily justified in the context of a
> larger refresh of our classfile format.

Sounds like a good candidate to be a standalone project first. I think we have the right people on this list to do it (and many of us have already done large portions of that work on our own).

> Or, most or all of dAC could be simulated using regular class loading, into a
> single-use ClassLoader object. The nominal bytecodes would have to be
> rewritten to use invokedynamic to manage the linking, at least to host-class
> names. But given that ASM is inside the JDK, the tools are all available.
> (Remi could do most of it in an afternoon. :-) ) Given such a simulation,
> the internal dAC mechanism could be used as an optimization, when available,
> but there would be a standard (complex) semantics derived from ordinary
> classes and indy.

This is how JRuby has survived for years. A classloader-per-class has a big memory load (ClassLoader has a lot of internal state, classes have a lot of metadata), but with permgen bumped up (or replaced with metaspace as in 8) and a few reuse tricks, it hasn't been a major issue for us.

In JRuby, the following are all generated into single-shot classloaders:

* Ruby methods JIT-compiled to bytecode at runtime
* Wrapper logic ("invokers") around AOT- or JIT-compiled method and closure bodies (including core methods written in Java)
* Synthetic interface implementations and subclasses (for implementing or extending from Ruby)
* Most other one-off pieces of bytecode we generate at runtime.
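The single-shot-classloader pattern described above can be sketched in a few lines. This is a minimal illustration, not JRuby's actual code: where JRuby would emit real method bodies with ASM, this sketch hand-assembles an empty class file just to have some bytes to define, and the class name "Gen" and the helper names are invented.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class OneShotDemo {
    // A throwaway loader: each instance defines exactly one class, and both
    // loader and class become eligible for unloading once unreferenced.
    public static final class OneShotLoader extends ClassLoader {
        public Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Hand-assemble a minimal, valid class file for "public class <name> {}".
    public static byte[] emptyClass(String name) {
        try {
            ByteArrayOutputStream bout = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bout);
            out.writeInt(0xCAFEBABE);                           // magic
            out.writeShort(0);                                  // minor version
            out.writeShort(52);                                 // major version (Java 8)
            out.writeShort(5);                                  // constant pool count = entries + 1
            out.writeByte(7); out.writeShort(2);                // #1 Class -> name at #2
            out.writeByte(1); out.writeUTF(name);               // #2 Utf8 (writeUTF matches CONSTANT_Utf8)
            out.writeByte(7); out.writeShort(4);                // #3 Class -> name at #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8
            out.writeShort(0x0021);                             // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                  // this_class  = #1
            out.writeShort(3);                                  // super_class = #3
            out.writeShort(0); out.writeShort(0);               // no interfaces, no fields
            out.writeShort(0); out.writeShort(0);               // no methods, no attributes
            return bout.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen with a byte-array sink
        }
    }

    public static void main(String[] args) {
        byte[] bytes = emptyClass("Gen");
        Class<?> a = new OneShotLoader().define("Gen", bytes);
        Class<?> b = new OneShotLoader().define("Gen", bytes);
        // Same name, different loaders: two distinct runtime classes.
        System.out.println(a.getName().equals("Gen") && a != b); // true
    }
}
```

Because each generated class lives in its own loader, dropping the last reference to an invoker lets the GC reclaim the class and its metadata, which is what keeps the permgen/metaspace cost bounded.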
We almost never need to look up those classes once created and instantiated. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: The Great Startup Problem
On Mon, Aug 25, 2014 at 6:59 AM, Fredrik Öhrström wrote:
> Calle Wilund and I implemented such an indy/methodhandle solution for
> JRockit, so I know it works. You can see a demonstration here:
> http://medianetwork.oracle.com/video/player/589206011001 That
> implementation jumps to C code that performs the invoke call, no fancy
> optimizations. The interpreter implementation of invoke can be
> optimized as well; that is what the first half of the talk is about. But it's
> really not that important for speed, because the speed comes from inlining
> the invoke call chain as early as possible after detecting that an indy is
> hot.

But can it work in C2? :-)

My impression of C2 is that specialization isn't in the list of things it does well. If we had a general-purpose specialization mechanism in Hotspot, things would definitely be a *lot* easier. We might not even need indy...just write Java code that does all your MH translations and specialize it to the caller's call site.

We can certainly get C2 to do these things for us...by generating a crapload of mostly-redundant bytecode. Oh wait...

- Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: The Great Startup Problem
On Mon, Aug 25, 2014 at 4:32 AM, Marcus Lagergren wrote:
> LambdaForms were most likely introduced as a platform independent way of
> implementing methodhandle combinators in 8, because the 7 native
> implementation was not very stable, but it was probably a mistake to add them
> as “real” classes instead of code snippets that can just be spliced in around
> the callsite. (I completely lack history here, so flame me if I am wrong)

That's how I remember it, yes. The native impl was not only a bit unstable...it was a security black hole because of all the special-casing for method handles in the JIT, and it had serious issues tracking type information correctly (the infamous NCDFE problem). LFs aren't perfect, but we are way better off now than we were with that implementation.

I do remember a conversation I had with Chris Thalinger about how it seemed wrong that method handles were treated as a middle grey area between call site and target, potentially not inlining in either direction. My suggestion was to treat all handles bound into a call site as though they were simply added bytecode in the surrounding method...essentially, force-inline non-direct handles into the caller immediately (for some definition of "immediately") and let the only remaining decision be which DMHs to inline as well. It worked ok for simple cases we tried, but there were some places it didn't work well. I don't remember the details.

We also did a rough equivalent to indy for JRuby's dispatch, but supported on any JVM:

* All Ruby-callable methods have a unique generated invoker class for arities 0-3,N. These invokers contain all argument adaptation, heap frame management, etc...just like a force-compiled MH chain.
* Each call site gets a synthetic method body that does lookup, caching, and dispatch.
Dispatch passes directly into those 0-3,N call paths, and for matching arities it should inline straight through (the invokers implement all direct-path arities as direct calls to the appropriate code). These generated call site methods were only monomorphic, but this setup gave us fully inlinable dynamic dispatches without indy. It worked well if we bumped up inlining thresholds (this was pre-incremental JIT), but we shelved it at the time. However, I'm probably going to explore this path again to get near-indy speeds on non-indy JVMs for the new IR-based JIT.

Put a bit more directly: I can generate a load of bytecode to get indy-like behavior with or without indy too. The gulf between the current indy implementation and my way – explicitly generating code where and when I need it – is LambdaForm interpretation and translation.

> For 9, it seems that we need a way to implement an indy that doesn’t result
> in class generation and installation of anonymous runtime classes. Note that
> _class installation_ as such is also a large overhead in the JVM - even more
> so when we regenerate more code to get more optimal types. I think we need to
> move from separate classes to inlined code, or something that requires
> minimum bookkeeping. I think this may be subject to profile pollution as
> well, but I haven’t been able to get my head around the ramifications yet.

I am going to play with the property Jochen mentioned, which forces LFs to JIT much sooner. I feel like we're almost where we need to be, but it feels like LFs need to be more directly represented as IR in C2 rather than going through this foggy middle ground of JVM bytecode. *I* can do foggy JVM bytecode...indy should be doing a lot better than that.

Hell, should MethodHandle be backed by Graal IR instead of LFs? It would still be interpretable, but when we go to JIT the chain we'd lose a lot less in translation, and we could do site- or target-specific specialization at that point.
I always saw MHs as a general-purpose call site IR. Maybe we should make good on that. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: The Great Startup Problem
On Sun, Aug 24, 2014 at 12:55 PM, Jochen Theodorou wrote:
> afaik you can set how many times a lambda form has to be executed before it
> is compiled... what happens if you set that very low... like 1 and disable
> tiered compilation?

Forcing all handles to compile early has the same negative effect...most are only called once, and the overhead of reifying them outweighs the cost of interpreting them. I need to play with it more, though. The property I think you're referring to did not appear to help us much.

>> We obviously still love working with OpenJDK, and it remains the best
>> platform for building JRuby (and other languages). However, our
>> failure as a community to address these startup/warmup issues is
>> eventually going to kill us. Startup time remains the #1 complaint
>> about JRuby, and warmup time may be a close second.
>
> how do normal ruby startup times compare to JRuby for a rails app?

Perhaps 10x faster startup across the board in C Ruby. With tier 1 we can get it down to 5x or so. It's incredibly frustrating for our users.

> All in all, the situation is for the Groovy world quite different I would
> say.

I'd guess that developers in the Groovy world typically do all their development in an IDE, which can keep a runtime environment available all the time. Contrast this to pretty much everyone not from a Java or C# background, where their IDE is a text editor and a command line.

You're also right that it's not quite a fair comparison. Rails is 100% Ruby. Perhaps 50-75% of the libraries in a given app are 100% Ruby.

- Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: The Great Startup Problem
On Sun, Aug 24, 2014 at 12:02 PM, Per Bothner wrote: > On 08/24/2014 03:46 AM, Marcus Lagergren wrote: >> This is mostly invokedynamic related. Basically, an indy callsite >> requires a lot of implicit class and byte code generation, that is >> the source of the overhead we are mostly discussing. While tiered >> compilation adds non determinism, it is usually (IMHO) bearable… Indy aggravates the situation...it's easily an order of magnitude more overhead at boot time. I am also talking about startup time without indy, however. I'll try to be more specific about our boot time overhead later in this reply. > (1) Kawa shows you can have dynamic languages on the JVM that both > run fast and have fast start-up. Like Clojure, I'd only consider Kawa to be *somewhat* dynamic. Most function calls can be statically dispatched, no? I think it's a poor comparison to languages that have fully dynamic method lookup at all (or most) sites. > (2) Other dynamic languages (Ruby, JavaScript, PHP) have had more problems, > possibly because they are "too dynamic". Or perhaps just their kind of > "dynamicism" is a poor match for the JVM. They're not "too dynamic"...they're "pervasively dynamic". But this is a red herring...I don't believe Ruby's dynamism is the source of our startup time issues. > (3) "Too dynamic" does not inherently mean a flaw in either the JVM *or* > the language, just a mis-match. (Though I'm of the school that believes > "more staticness" is better for programmer productivity and software quality > - as well as performance. Finding the right tradeoff is hard.) I believe development tasks will require not just a balance of dynamism and staticism, but a range of languages along that spectrum. There is no one true language, and no one true balance between dynamic and static. I think this is birdwalking away from the original problem, though. > (4) Invokedynamic was a noble experiment to alleviate (2), but so far it > does not seem to have solved the problems. 
Conceptually, invokedynamic has proven itself incredibly capable. In reality, the implementation has been harder and taken longer than we expected. We're also butting up against a JVM that has been optimized around Java for years...it's hard to teach that old dog new tricks.

> (5) It is reasonable to continue to seek improvements in invokedynamic,
> but in terms of resource prioritization other enhancements in the Java platform
> (value types, tagged values, reified generics, continuations, removing class size
> limitations, etc etc) are more valuable.

Many of which will probably use invokedynamic in some form under the covers. Getting invokedynamic solid, fast, and predictable should be priority one for JVM hackers right now.

> (6) That of course does not preclude an "aha": If we made modest change xyz,
> that could be a big help. I just don't think Oracle or the community should
> spend too much time on "fixing" invokedynamic.

I disagree wholeheartedly! Invokedynamic is by far the best tool we have going forward to extend the JVM and languages that run atop it. It's going through growing pains, though.

I wanted to describe JRuby's boot time, so people don't think this is a problem of a "too dynamic" language, or solely an invokedynamic issue. As with any other JVM language, JRuby is almost entirely written in Java. So our entire runtime needs to warm up before we get decent performance. This includes:

* A very complicated parser. Ruby's grammar has been designed to accommodate programmers rather than parsers, and it has thousands of productions and state transitions. Note that all Ruby applications boot from source every time they start up.
* An AST-based interpreter (JRuby 1.7). The AST nodes call each other, and nested nodes deepen the stack. This is not as efficient, memory-wise, as a flat instruction-based interpreter (IR in JRuby 9000), but it has excellent inlining characteristics.
A CallNode typically will call an ArgsNode to process args, a BlockArg node to process captured closures, etc. So the AST kinda-sorta trace JITs on the small. It's worth noting that JRuby's AST interpreter, once warm, is much faster at running Ruby code than cold, compiled Ruby (JVM bytecode) in the JVM interpreter.

* A traditional CFG-based IR compiler (JRuby 9000). We have been working to reduce the overhead of the new compiler, since it is additional overhead compared to JRuby 1.7. We're getting there.
* An IR-based interpreter (JRuby 9000). The IR interpreter uses one large frame for the interpreter and small frames for instruction bodies. We have been working to manually inline just enough logic to make the IR interpreter of similar or lower overhead compared to the AST interpreter. This may involve the introduction of superinstructions, or we may get things "good enough" and rely on the JVM bytecode JIT to take us the rest of the way.
* A JVM bytecode compiler, from either AST or IR. The latter is much simpler, but this is still an
The Great Startup Problem
Marcus coaxed me into making a post about our indy issues. Our indy issues mostly surround startup and warmup time, so I'm making this a general post about startup and warmup.

When I started working on JRuby 7 years ago, I hoped we'd have a good answer for poor startup time and long warmup times. Today, the answers are no better -- and in many cases much worse -- than when I started. Here's a summary of our experience over the years...

* client versus server

Early on, we made JRuby's launcher use client mode by default. This was by far the best way to get good startup performance, but it led to us perpetuating the old question "which mode are you running in" when people reported poor steady-state performance.

* Tiered compiler

The promise of the tiered compiler was great: client-fast startup with server-fast steady state. In practice, tiered has failed to meet expectations for us. The situation is aggravated by the loss of the -client and -server flags.

On the startup side, we have found that the tiered compiler never even comes close to the startup time of -client. For a nontrivial app startup, like a Rails app, we see a 50% reduction in startup time by forcing tier 1 (which is C1, the old -client mode) rather than letting the tiered compiler work normally. Obviously limiting ourselves to tier 1 means performance is reduced, but these days our #1 user complaint is startup time. So, we have AGAIN taken the step of putting startup-improving flags into our launchers: jruby --dev forces tier 1 + client mode.

On the steady-state side, the tiered compiler is rather unpredictable. Some cases will be faster (presumably from better profiling in earlier tiers), while others will be much slower. And it can vary from run to run...tiered steady-state performance is even harder to predict than C2 (-server). We have done no investigation here.

* Invokedynamic

We love indy. We love it more than just about anyone.
But we have again had to make indy support OFF by default in JRuby 1.7.14, and may have to do the same for JRuby 9000. Originally, we had indy off because of the NCDFE bugs in the old implementation. LambdaForms have fixed all that, and with JIT improvements in the past year they generally (eventually) reach the same steady-state performance. Unfortunately, LambdaForms have an enormous startup-time cost. I believe there are two reasons for this:

1. Method handle chains can now result in dozens of lambda forms, making the initial bootstrapping cost much higher. Multiply this by thousands of call sites, all getting hit for the first time. Multiply that by PIC depth. And then remember that many boot-time operations will blow out those caches, so you'll start over repeatedly. Some of this can be mitigated in JRuby, but much of it cannot.

2. Lambda forms are too slow to execute and take too long to optimize down to native code. Lambda forms work sorta like the tiered compiler. They'll be interpreted for a while, then they'll become JVM bytecode for a while, which interprets for a while, then the tiered compiler's first phase will pick it up. There's no way to "commit" a lambda form you know you're going to be hitting hard, so it takes FOREVER to get from a newly-bootstrapped call site to the 5 assembly instructions that *actually* need to run.

I do want to emphasize that for us, LambdaForms usually do get to the same peak performance we saw with the old implementation. It's just taking way, way too long to get there. Because of these issues, JRuby's new --dev flag turns invokedynamic off, and JRuby 1.7.14 will once again turn indy off by default on all JVM versions.

* Other ways of mitigating startup time

We have recommended Nailgun in the past. Nailgun keeps a JVM running in the background, and you toss it commands to run.
It works well as long as the commands are actually self-contained, self-cleaning units of work; spin up one thread or leave resources open, and the Nailgun server eventually becomes unusable. We now recommend Drip as a similar solution. For each command you run, Drip attempts to start additional larval JVMs in the background in preparation for future commands. You can configure those instances to pre-boot libraries or application resources, to reduce the work done at startup for the next command (e.g. preboot your Rails application, and then the next command just has to utilize it). Drip is cleaner than Nailgun, but never quite achieves the same startup time without a lot of configuration. It is also a bit of a hack...you can easily preboot something in the "next JVM" that is out of date by the time you use it. CONCLUSION... We obviously still love working with OpenJDK, and it remains the best platform for building JRuby (and other languages). However, our failure as a community to address these startup/warmup issues is eventually going to kill us. Startup time remains the #1 complaint about JRuby, and warmup time may be a close second. What are the rest of you doing to deal with
Re: How high are the memory costs of polymorphic inline caches?
Hello, fellow implementer :-)

On Mon, Aug 18, 2014 at 6:01 AM, Raffaello Giulietti wrote:
> So, the question is whether some of you have experience with large scale
> projects written in a dynamic language implemented on the JVM, that makes
> heavy use of indy and PICs. I'm curious about the memory load for the PICs.
> I'm also interested whether the standard Oracle server JVM satisfactorily
> keeps up with the load.

JRuby has implemented call sites using this sort of PIC structure since the beginning. I dare say we were the first. Experimentally, I determined that the cost of a PIC became greater than a non-indy, non-inlining monomorphic cache at about 5 deep. By cost, I mean the overhead involved in dispatching from the caller to an empty method; essentially just the cost of the plumbing. Now of course that number's going to vary, but overall a small PIC seems to have value...especially when you consider that the cost of rebinding a call site is rather high. We do have JRuby users using our indy logic in production, and none of their concerns have had any relation to the PIC (usually, it's just startup/warmup concerns).

> For example, we have a large Smalltalk application with about 50'000 classes
> and about 600'000 methods. In Smalltalk, almost everything in code is a
> method invocation, including operators like +, <=, etc. I estimate some 5-10
> million method invocation sites. How many of them are active during a
> typical execution, I couldn't tell. But if the Smalltalk runtime were
> implemented on the JVM, PICs would quite certainly represent a formidable
> share of the memory footprint.

That's why you limit their size. If an inline cache behind a GWT takes up N bytes in memory, a PIC based on the same invalidation logic should just be N * X, where X is the depth of your PIC. Of course, you could do what we do in JRuby, and make the PIC depth configurable to try out a few things.
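The N-deep GWT structure being discussed is just a chain of guardWithTest handles ending in a slow path. A minimal, hypothetical sketch (the class names, target methods, and a fixed depth of 2 are invented for illustration; this is not JRuby's actual call site code):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class PicSketch {
    // Stand-ins for the specialized targets a real runtime would bind.
    public static String forString(Object o)  { return "String:" + o; }
    public static String forInteger(Object o) { return "Integer:" + o; }
    public static String slowPath(Object o)   { return "miss:" + o.getClass().getSimpleName(); }

    // Build a (Object)boolean guard that tests the receiver's class.
    static MethodHandle classGuard(Class<?> c) throws ReflectiveOperationException {
        MethodHandle isInstance = MethodHandles.lookup().findVirtual(
                Class.class, "isInstance",
                MethodType.methodType(boolean.class, Object.class));
        return isInstance.bindTo(c);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType type = MethodType.methodType(String.class, Object.class);

        // An N-deep PIC is N nested guardWithTest nodes over the slow path,
        // so both footprint and dispatch cost grow linearly with depth.
        MethodHandle pic = lookup.findStatic(PicSketch.class, "slowPath", type);
        pic = MethodHandles.guardWithTest(classGuard(Integer.class),
                lookup.findStatic(PicSketch.class, "forInteger", type), pic);
        pic = MethodHandles.guardWithTest(classGuard(String.class),
                lookup.findStatic(PicSketch.class, "forString", type), pic);

        System.out.println(pic.invoke("hi")); // String:hi
        System.out.println(pic.invoke(42));   // Integer:42
        System.out.println(pic.invoke(3.14)); // miss:Double
    }
}
```

A real PIC would also rebind the site on a miss (e.g. via MutableCallSite.setTarget) up to the configured depth; the sketch shows only the guard chain itself.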
> More generally, apart from toy examples, are there studies in real-world > usage of indy and PICs in large applications? > Perhaps some figures from the JRuby folks, or better, their users' > applications would be interesting. The only "studies" we have are the handful of JRuby users running with indy enabled in production. They love the higher performance, hate the startup/warmup time, and most of them had to either bump permgen up or switch to Java 8 to handle the extra code generation happening in indy. I'm happy to answer any other questions. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Loopy CallSite
I played with this some years ago. Doesn't it just become recursive, because it won't inline through the dynamicInvoker?

- Charlie (mobile)

On Jul 12, 2014 9:36 AM, "Remi Forax" wrote:
> It seems that the JIT is lost when there is a loopy callsite and never
> stabilizes (or the steady state is after the program ends).
>
> import java.lang.invoke.MethodHandle;
> import java.lang.invoke.MethodHandles;
> import java.lang.invoke.MethodType;
> import java.lang.invoke.MutableCallSite;
>
> public class Loop {
>   static class LoopyCS extends MutableCallSite {
>     public LoopyCS() {
>       super(MethodType.methodType(void.class, int.class));
>
>       MethodHandle target = dynamicInvoker();
>       target = MethodHandles.filterArguments(target, 0, FOO);
>       target = MethodHandles.guardWithTest(ZERO,
>           target,
>           MethodHandles.dropArguments(MethodHandles.constant(int.class, 0).asType(MethodType.methodType(void.class)), 0, int.class));
>       setTarget(target);
>     }
>   }
>
>   static final MethodHandle FOO, ZERO;
>   static {
>     try {
>       FOO = MethodHandles.lookup().findStatic(Loop.class, "foo", MethodType.methodType(int.class, int.class));
>       ZERO = MethodHandles.lookup().findStatic(Loop.class, "zero", MethodType.methodType(boolean.class, int.class));
>     } catch (NoSuchMethodException | IllegalAccessException e) {
>       throw new AssertionError(e);
>     }
>   }
>
>   private static boolean zero(int i) {
>     return i != 0;
>   }
>
>   private static int foo(int i) {
>     COUNTER++;
>     return i - 1;
>   }
>
>   private static int COUNTER = 0;
>
>   public static void main(String[] args) throws Throwable {
>     for(int i=0; i<100_000; i++) {
>       new LoopyCS().getTarget().invokeExact(1_000);
>     }
>     System.out.println(COUNTER);
>   }
> }
>
> cheers,
> Rémi
>
> ___
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>
___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
FORK
What would it take to make Hotspot forkable? Obviously we'd need to pause all VM threads and restart them on the other side (or perhaps a prefork mode that doesn't spin up threads?), but I know there are challenges with signal handlers etc. I ask for a few reasons...

* Dalvik has shown what you can do with a "larval" preforking setup. This is a big reason why Android apps can run in such a small amount of memory and start up so quickly.
* Startup time! If we could fork an already-hot JVM, we could hit the ground running with *every* command, *and* still have truly separate processes.
* There's a lot of development and scaling patterns that depend on forking, and we get constant questions about forking on JRuby.
* Rubinius -- a Ruby VM with partially-concurrent GC, a signal-handling thread, JIT threads, and real parallel Ruby threads -- supports forking. They bring the threads to a safe point, fork, and restart them on the other side. Color me jealous.

So...given that OpenJDK is rapidly expanding into smaller-profile devices and new languages and development patterns, perhaps it's time to make it fit into the UNIX philosophy. Where do we start?

- Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Number of Apps per JVM
I think some of these requirements are at cross purposes. For example, how can you have thread-safe object access but still be able to freely pass objects across "processes"? How can you freely pass objects across in-process VMs but still enforce memory red lines? The better isolation you get between processes/VMs, the more overhead you impose on communication between them. I have to admit I don't know how Kilim does its object isolation. - Charlie On Sun, Jan 12, 2014 at 7:12 PM, Mark Roos wrote: > Thanks for the suggestion on Waratek, not sure how it would address the > process to process > messaging issue. It did lead me to another very interesting read though, > http://osv.io. Again > not an answer for the messaging but something that I have always thought > would be interesting to > try, a stripped down jvm+os. Perhaps JavaOS -). > > thx > mark > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: RFC: JDK-8031043: ClassValue's backing map should have a smaller initial size
On Thu, Jan 9, 2014 at 7:47 PM, Christian Thalinger wrote: > > On Jan 9, 2014, at 5:25 PM, Charles Oliver Nutter wrote: > >> runtime. Generally, this does not exceed a few dozen JRuby instances >> for an individual app, and most folks don't deploy more than a few >> apps in a given JVM. > > Interesting. Thanks for the information. I forgot to mention: more and more users are going with exactly one JRuby runtime per app, and most Ruby folks deploy one app in a given JVM. So the number of values attached to a class is trending toward 1. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: RFC: JDK-8031043: ClassValue's backing map should have a smaller initial size
It depends how JRuby is deployed. If the same code runs in every JRuby runtime, then there would be one value attached to a given class per runtime. Generally, this does not exceed a few dozen JRuby instances for an individual app, and most folks don't deploy more than a few apps in a given JVM. - Charlie On Thu, Jan 9, 2014 at 1:16 PM, Christian Thalinger wrote: > > On Jan 9, 2014, at 2:46 AM, Jochen Theodorou wrote: > >> Am 08.01.2014 21:45, schrieb Christian Thalinger: >> [...] >>> If we’d go with an initial value of 1 would it be a performance problem for >>> you if it grows automatically? >> >> that means the map will have to grow for hundreds of classes at startup. >> I don't know how much impact that will have > > If it’s only hundreds it’s probably negligible. You could do a simple > experiment if you are worried: change ClassValueMap.INITIAL_ENTRIES to 1, > compile it and prepend it to the bootclasspath. > >> >> bye Jochen >> >> -- >> Jochen "blackdrag" Theodorou - Groovy Project Tech Lead >> blog: http://blackdragsview.blogspot.com/ >> german groovy discussion newsgroup: de.comp.lang.misc >> For Groovy programming sources visit http://groovy-lang.org >> >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: JVM Language Summit (Europe)?
Not bad... I think I could manage that. - Charlie On Sat, Oct 5, 2013 at 8:23 AM, Ben Evans wrote: > How about a 1.5 day conference on Thursday 30th & morning of Friday 31st > January? > > Then people who are coming to Europe for FOSDEM can arrive in London on > Weds, have the language summit for 1.5 days & we can all get the Eurostar to > Brussels together to arrive in time for the Delerium cafe? > > Thanks, > > Ben > > > On Sat, Oct 5, 2013 at 11:09 AM, Martijn Verburg > wrote: >> >> Hi all, >> >> Great - I think that's enough positive responses + the ones I got on >> twitter :-). Ben and I will put our thinking caps on and see if we can put >> something very close to FOSDEM so that folks can just Eurostar across (it's >> the only way to travel ;p). >> >> We'll try to grab some sponsorship etc, but I'll warn people that for now >> they should expect to pay their own way for travel and accommodation. >> >> Will post here again when we have some more concrete plans! >> >> >> Cheers, >> Martijn >> >> >> On 5 October 2013 07:30, Cédric Champeau >> wrote: >>> >>> Hi! >>> >>> I am interested too, and I'd vote for an "opposite" summit. >>> >>> Cédric >>> >>> >>> 2013/10/2 Martijn Verburg Hi all, Hope this is the right mailing list to post on, apologies for the slight OT post. A few people asked whether the LJC could/would host a JVM language summit in Europe which would hopefully cover the EMEA based folks that can't make the existing summit. I'd like to get an idea of whether there's appetite for this and if so when it should be run: * At the same time and have some video-conferencing sessions? OR * At a time almost 'opposite' to the existing summit so that there's a summit roughly every 6-months. Ping me directly with your thoughts (unless this is the right mailing list - in that case reply back here). 
Cheers, Martijn ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: JVM Language Summit (Europe)?
On Oct 4, 2013 5:50 PM, "George Marrows" wrote: > I'd suggest 'opposite' to the existing summit, so that we might get some of the key figures from that conf (Brian Goetz, John Rose, Charlie Nutter etc) over in Europe. Charlie has certainly said he'd be interested in coming to one in Europe, particularly if it preceded/followed another European conf he would like to attend. Absolutely! The FOSDEM Java room has kinda served that purpose but it has a broader focus. I would definitely like an official event. > On Wed, Oct 2, 2013 at 1:38 PM, Martijn Verburg wrote: >> >> Hope this is the right mailing list to post on, apologies for the slight OT post. This is a pretty good list. Also JVM-L (I will forward). >> * At the same time and have some video-conferencing sessions? OR >> * At a time almost 'opposite' to the existing summit so that there's a summit roughly every 6-months. Opposite for sure. I hate to stack another event on the FOSDEM+Jfokus schedule but that would maybe maximize the number of US folks that would be there. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Interpreting Mission Control numbers for indy
A bit more on performance numbers for this application. With no indy, monomorphic caches...the full application (a data load) runs in about a minute. I fully recognize that this is a short run, but JMC seems to indicate the bulk of code has compiled well before the halfway point. With 7u40 or 8, no tiered compilation, it takes about two minutes. Tiered reduces non-indy time to 51s and indy time to 1m29s. Tiered + indy + only using monomorphic cache (no direct binding) runs in 1m, still 9s slower than non-indy. With normal settings, indy call sites do settle down and are mostly monomorphic. For the two phases of the data load, I stop seeing JRuby bind indy call sites a couple seconds in. There does not appear to be any difference in performance on this app between 7u40 and 8b103. Like I say...I think the user would be willing to share the application, and I feel like the numbers warrant investigation. Standing by! :-) - Charlie On Wed, Sep 18, 2013 at 10:39 AM, Charles Oliver Nutter wrote: > I've been playing with JMC a bit tonight, running a user's application > that's about 2x slower using indy than using trivial monomorphic > caches (and no indy call sites). I'm trying to understand how to > interpret what I see. > > In the Code/Overview results, where it lists "hot packages", the #1 > and #2 packages are java.lang.invoke.LambdaForm$MH and DMH, accounting > for over 37% of samples. That sounds high, but I'm willing to grant > they're hit pretty hard for a fully dynamic application. > > Results in the "Hot Methods" tab show similar things, like > LambdaForm...invokeStatic_LL_L as the number one result and LambdaForm > entries dominating the top 50 entries in the profile. Again, I know > I'm hitting dynamic call sites hard and sampling is not always > accurate. > > If I look at compilation events, I only see a handful of > LambdaForm...convert being compiled. I'm not sure if that's good or > bad. 
My assumption is that LFs don't show up here because they're > always being inlined into a caller. > > The performance numbers for the app have me worried too. If I run > JRuby with stock settings, we will chain up to 6 call targets at a > call site. The lower I drop this number, the better performance gets; > when I drop all the way to zero, forcing all invokedynamic call sites > to fail over immediately to a monomorphic inline cache, performance > *almost* gets back to the non-indy implementation. This leads me to > believe that the less I use invokedynamic (or the fewer LFs involved), > the better. That doesn't bode well. > > I believe the user would be happy to allow me to make these JMC > recordings available, and I'm happy to re-run with additional events > or gather other information. The JRuby community has a number of very > large applications that push the limits of indy. We should work > together to improve it. > > - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
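The failover behavior described above -- an invokedynamic call site that guards on a cached case and falls through to a slower path otherwise -- is essentially what `MethodHandles.guardWithTest` builds. A minimal, self-contained sketch follows; the class, the `String` guard, and the fast/slow paths are all illustrative, not JRuby's actual call-site code:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class MonoCacheSketch {
    static boolean isString(Object o) { return o instanceof String; }
    static String fastPath(Object o)  { return "fast:" + o; }
    static String slowPath(Object o)  { return "slow:" + o; }

    static final MethodHandle SITE;
    static {
        try {
            MethodHandles.Lookup lk = MethodHandles.lookup();
            MethodType target = MethodType.methodType(String.class, Object.class);
            // guardWithTest: if the guard passes, invoke the cached target;
            // otherwise take the fallback (in a real call site the fallback
            // would relink or fail over to an inline cache).
            SITE = MethodHandles.guardWithTest(
                    lk.findStatic(MonoCacheSketch.class, "isString",
                            MethodType.methodType(boolean.class, Object.class)),
                    lk.findStatic(MonoCacheSketch.class, "fastPath", target),
                    lk.findStatic(MonoCacheSketch.class, "slowPath", target));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static String dispatch(Object receiver) throws Throwable {
        return (String) SITE.invokeExact(receiver);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(dispatch("x")); // fast:x  -- guard passes
        System.out.println(dispatch(42));  // slow:42 -- falls back
    }
}
```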
Interpreting Mission Control numbers for indy
I've been playing with JMC a bit tonight, running a user's application that's about 2x slower using indy than using trivial monomorphic caches (and no indy call sites). I'm trying to understand how to interpret what I see. In the Code/Overview results, where it lists "hot packages", the #1 and #2 packages are java.lang.invoke.LambdaForm$MH and DMH, accounting for over 37% of samples. That sounds high, but I'm willing to grant they're hit pretty hard for a fully dynamic application. Results in the "Hot Methods" tab show similar things, like LambdaForm...invokeStatic_LL_L as the number one result and LambdaForm entries dominating the top 50 entries in the profile. Again, I know I'm hitting dynamic call sites hard and sampling is not always accurate. If I look at compilation events, I only see a handful of LambdaForm...convert being compiled. I'm not sure if that's good or bad. My assumption is that LFs don't show up here because they're always being inlined into a caller. The performance numbers for the app have me worried too. If I run JRuby with stock settings, we will chain up to 6 call targets at a call site. The lower I drop this number, the better performance gets; when I drop all the way to zero, forcing all invokedynamic call sites to fail over immediately to a monomorphic inline cache, performance *almost* gets back to the non-indy implementation. This leads me to believe that the less I use invokedynamic (or the fewer LFs involved), the better. That doesn't bode well. I believe the user would be happy to allow me to make these JMC recordings available, and I'm happy to re-run with additional events or gather other information. The JRuby community has a number of very large applications that push the limits of indy. We should work together to improve it. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Reproducible InternalError in lambda stuff
On Mon, Sep 16, 2013 at 2:36 AM, John Rose wrote: > I have refreshed mlvm-dev and pushed some patches to it which may address > this problem. I'll get a build put together and see if I can get users to test it. > If you have time, please give them a try. Do "hg qgoto meth-lfc.patch". > > If this stuff helps we would like to work towards a fix in 7u. > > What is your time frame for JRuby 1.7.5? It is on hold indefinitely while we work out user-reported issues (most are not 7u40-related, but we'd like to have an answer for those before release too). I've attached one user's hs_err dump. This was with a 4GB heap. Code cache full and failing spectacularly? - Charlie hs_err_pid1184.log Description: Binary data ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Reproducible InternalError in lambda stuff
We are getting many reports of memory issues under u40 running with indy support. Some seem to go away with bigger heaps, but others are still eventually failing. This is a very high priority for us because we had hoped to release JRuby 1.7.5 with indy enabled (finally) and that may not be possible. On Sep 14, 2013 3:07 PM, "David Chase" wrote: > I am not sure, but it seemed like "something" bad floated into jdk8 for a > little while, and then floated back out again. > I haven't kept close enough track of the gc-dev mailing list, but for a > few days I was frequently running out of memory when I had not been before > (i.e., doing a build, or simply initializing some of the internal tests) -- > this on a machine where when I checked, at least 4G was free for the taking. > > Something happened, and the problems went away. > > On 2013-09-13, at 6:59 PM, Charles Oliver Nutter > wrote: > > > On Sat, Sep 14, 2013 at 12:57 AM, Charles Oliver Nutter > > wrote: > >> * More memory required when running with indy versus without, all > >> other things kept constant (reproduced by two people, one of them me) > > > > I should say *significantly more* memory here. The app Alex was > > running had to go from 1GB heap / 256MB permgen to 2G/512M when it was > > running *fine* before...and this is just for running the *tests*. > > > > - Charlie > > ___ > > mlvm-dev mailing list > > mlvm-dev@openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Reproducible InternalError in lambda stuff
On Sat, Sep 14, 2013 at 12:57 AM, Charles Oliver Nutter wrote: > * More memory required when running with indy versus without, all > other things kept constant (reproduced by two people, one of them me) I should say *significantly more* memory here. The app Alex was running had to go from 1GB heap / 256MB permgen to 2G/512M when it was running *fine* before...and this is just for running the *tests*. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Reproducible InternalError in lambda stuff
I do not...but it appears to be tied to getting an OOM when inside lambda code. We now have a third-party report of the same issue. Because the internal error appears to nuke the original exception, we don't know for sure that this is memory-related, but the user did see *other* threads raise OOM and increasing memory solved it. https://github.com/jruby/jruby/issues/1014 So...there are two bad things here... * More memory required when running with indy versus without, all other things kept constant (reproduced by two people, one of them me) * InternalError bubbling out and swallowing the cause (reproduced by the same two people)...this may count as two issues. My original reproduction did not appear to fire on Java 8, but it also appeared to run forever...so it's possible that we were at a specific memory threshold (permgen? normal heap? metaspace?) or Java 8 may be failing more gracefully. Feel free to discuss or offer suggestions to Alex on the bug report above. I will be monitoring. - Charlie On Mon, Sep 9, 2013 at 6:21 PM, Christian Thalinger wrote: > > On Sep 6, 2013, at 11:11 PM, Charles Oliver Nutter > wrote: > >> I can reproduce this by running a fairly normalish command in JRuby: >> >> (Java::JavaLang::InternalError) >>guard=Lambda(a0:L,a1:L,a2:L,a3:L,a4:L)=>{ >> >> t5:I=MethodHandle(ThreadContext,IRubyObject,IRubyObject,IRubyObject)boolean(a1:L,a2:L,a3:L,a4:L); >> >> t6:L=MethodHandleImpl.selectAlternative(t5:I,(MethodHandle(ThreadContext,IRubyObject,IRubyObject,IRubyObject)IRubyObject),(MethodHandle(ThreadContext,IRubyObject,IRubyObject,IRubyObject)IRubyObject)); >>t7:L=MethodHandle.invokeBasic(t6:L,a1:L,a2:L,a3:L,a4:L);t7:L} >> >> I think it's happening at an OutOfMemory event (bumping up memory >> makes it go away), so it may not be a critical issue, but I thought >> I'd toss it out here. > > Do you know where it's coming from? 
-- Chris > >> >> - Charlie >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Reproducible InternalError in lambda stuff
I can reproduce this by running a fairly normalish command in JRuby: (Java::JavaLang::InternalError) guard=Lambda(a0:L,a1:L,a2:L,a3:L,a4:L)=>{ t5:I=MethodHandle(ThreadContext,IRubyObject,IRubyObject,IRubyObject)boolean(a1:L,a2:L,a3:L,a4:L); t6:L=MethodHandleImpl.selectAlternative(t5:I,(MethodHandle(ThreadContext,IRubyObject,IRubyObject,IRubyObject)IRubyObject),(MethodHandle(ThreadContext,IRubyObject,IRubyObject,IRubyObject)IRubyObject)); t7:L=MethodHandle.invokeBasic(t6:L,a1:L,a2:L,a3:L,a4:L);t7:L} I think it's happening at an OutOfMemory event (bumping up memory makes it go away), so it may not be a critical issue, but I thought I'd toss it out here. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Classes on the stack trace (was: getElementClass/StackTraceElement, was: @CallerSensitive public API, was: sun.reflect.Reflection.getCallerClass)
On Tue, Jul 30, 2013 at 7:17 AM, Peter Levart wrote: > For outside JDK use, I think there are two main needs, which are actually > distinct: > > a) the caller-sensitive methods > b) anything else that is not caller-sensitive, but wants to fiddle with the > call-stack > > For caller-sensitive methods, the approach taken with new > Reflection.getCallerClass() is the right one, I think. There's no need to > support a fragile API when caller-sensitivity is concerned, so the lack of > "int" parameter, combined with annotation for marking such methods is > correct approach, I think. The refactorings to support this change in JDK > show that this API is adequate. The "surface" public API methods must > capture the caller class and pass it down the internal API where it can be > used. This is largely what I advocated and what we do in JRuby. First of all, we've never made any guarantees about calls to caller-sensitive methods like Class.forName. If issues were reported, our answer was to pass in a classloader, just as you would have to do if you had a utility library between your user code and a Class.forName call. Second, the presence of a hidden API to walk the stack is not an excuse for using it and then complaining when it is taken away. Yes yes, Unsafe falls into this category too, but in the case of Unsafe there's no alternative. With getCallerClass, there is an alternative: pass down the caller class. This may not be attractive, especially given the magic provided by getCallerClass before...but it is a solution. Third, for language runtimes like Groovy, it seems to me that only *effort* is required to handle the passing down of the caller class. If we look at the facts, we see that getCallerClass is needed to skip intermediate frames by the runtime. So the runtime knows about these intermediate frames and knows how many to skip. This means that the original call is not into an uncontrolled library, but instead is into Groovy-controlled code. 
Passing down the caller object or class at that point is obviously possible. Even if the hassle of passing a new additional parameter through the call protocol is too difficult, a ThreadLocal could be utilized for this purpose. For the logging frameworks, I do not have a solution other than the same one we recommend to JRuby users: pass in the class or classloader. I could also suggest generating a backtrace and walking back to the appropriate element, but backtrace generation is currently far too expensive to use in heavily-hit logging code (this should be improved). I will also say that I agree an official stack-walking capability would be incredibly useful, and not just for this case. But that's a much bigger fish to fry and it won't happen in JDK8 timeframe. > Now that is the question for mlvm-dev mailing list: Isn't preventing almost > all Lookup objects obtained by Lookup.in(RequestedLookupClass.class) from > obtaining MHs of @CallerSensitive methods too restrictive? Probably. It seems to me that @CallerSensitive is no different from exposing private methods or fields through a MH. Perhaps it should recalculate caller when it's called, perhaps it should calculate it at lookup time, but not being retrievable at all seems like overkill. I will grant that overkill was probably the quickest and safest solution at the time. > I would point out that this could all easily be solved simply by adding a > getElementClass() method to StackTraceElement, but there was strong > opposition to this, largely due to serialization issues. Since that is > apparently not an option, I propose the following API, based on the various > discussions in the last two months, StackTraceElement, and the API that .NET > provides to achieve the same needs as listed above: A new stack trace getter that provides classes would be an immense improvement, but only if it did not have the same overhead as current stack trace generation. Again, that needs to be fixed. 
> Furthermore, I propose that we restore the behavior of > sun.reflect.Reflection#getCallerClass(int) /just for Java 7/ since the > proposed above solution cannot be added to Java 7. Probably for 7 and 8. I'm pessimistic about its use, but the timeframe for moving away from it is too short. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
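The "pass the caller down" / ThreadLocal alternative discussed above can be sketched as follows. Every name here is hypothetical -- this illustrates the idea of a surface API capturing its caller and publishing it for inner code to read, not any proposed JDK API:

```java
import java.util.function.Supplier;

public class CallerContext {
    // Set by the surface API, read by inner library code instead of
    // walking the stack with getCallerClass.
    private static final ThreadLocal<Class<?>> CALLER = new ThreadLocal<>();

    // The surface method knows its caller statically (or captures it at its
    // own call site) and publishes it for the duration of the call.
    public static <T> T withCaller(Class<?> caller, Supplier<T> body) {
        Class<?> saved = CALLER.get();
        CALLER.set(caller);
        try {
            return body.get();
        } finally {
            CALLER.set(saved); // restore for re-entrant calls
        }
    }

    public static Class<?> currentCaller() {
        return CALLER.get();
    }

    public static void main(String[] args) {
        String who = withCaller(CallerContext.class,
                () -> "caller = " + currentCaller().getSimpleName());
        System.out.println(who); // caller = CallerContext
    }
}
```

The cost is a ThreadLocal access per surface call, which is why passing the caller as an explicit parameter through the call protocol (as JRuby does with the calling object) is preferable when the runtime controls the intermediate frames.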
Re: sun.reflect.Reflection.getCallerClass(int) is going to be removed... how to replace?
On Wed, Jul 10, 2013 at 4:30 AM, Cédric Champeau wrote: > I must second Jochen here. That getCallerClass doesn't work anymore in > an update release is unacceptable to me. As Jochen explained, there's no > suitable replacement so far. We can live with getCallerClass > disappearing if there's a replacement, but obviously, the > @CallerSensitive "solution" is not one for us. There are additional > frames in our runtime. Also we need to support multiple JDKs (5 to 8, > but 8 is already broken). Especially, we don't have any replacement for > @Grab which makes use of it internally. Furthermore, I suspect > Class.forName and ResourceBundle.getBundle are widespread in user code > and it used to work. This is not the kind of stuff that people expect to > break when upgrading a JDK, and we can't tell them to rewrite their code > (especially, finding the right classloader might involve more serious > refactoring if it needs to be passed as a method argument). Another alternative we are not using in JRuby...but we could. In JRuby, we pass the calling object into every call site to check Ruby-style visibility at lookup time (we can't statically determine visibility, and we do honor it). That gets you a bit closer to being able to get the caller's classloader without stack tricks (though I admit it does nothing for methods injected into a class from a different classloader). On Wed, Jul 10, 2013 at 4:40 AM, Noctarius wrote: > Maybe a solution could be an annotation to mark calls to not > appear in any stacktrace? Personally, I'd love to see *any* way to teach JVM about language-specific stack traces. Currently JRuby post-processes exception traces to mine out compiled Ruby lines and transform interpreter frames into proper file:line pairs. A way to say "at this point, call back my code to build a StackTraceElement" would be very useful across languages. Of course, omitting from stack trace has very little to do with stack-walking frame inspection tricks like CallerSensitive. 
- Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: sun.reflect.Reflection.getCallerClass(int) is going to be removed... how to replace?
We advise our users to pass in a classloader. Class.forName's stack-based discovery of classloaders is too magic anyway. In general, when there's magic happening at the JVM level that is not possible for us to duplicate in JRuby, we warn our users away from depending on it. - Charlie On Mon, Jul 8, 2013 at 3:33 AM, Jochen Theodorou wrote: > Hi all, > > 5 days nothing... Does that mean it is like that, there is no way around > and I have to explain my users, that Java7/8 is going to break some > "minor" functionality? > > bye blackdrag > > -- > Jochen "blackdrag" Theodorou - Groovy Project Tech Lead > blog: http://blackdragsview.blogspot.com/ > german groovy discussion newsgroup: de.comp.lang.misc > For Groovy programming sources visit http://groovy-lang.org > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
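The workaround recommended above -- pass a loader explicitly instead of relying on `Class.forName`'s stack-based discovery -- is just the standard three-argument overload:

```java
public class ForNameExplicit {
    public static void main(String[] args) throws Exception {
        // Class.forName(String) walks the stack to find the caller's
        // loader; the three-argument overload takes the loader explicitly,
        // so no stack magic is needed.
        ClassLoader loader = ForNameExplicit.class.getClassLoader();
        Class<?> list = Class.forName("java.util.ArrayList", false, loader);
        System.out.println(list.getName()); // java.util.ArrayList
    }
}
```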
Re: jsr292-mock in Maven; coro-mock, unsafe-mock available
On Sun, Jul 7, 2013 at 3:16 PM, Remi Forax wrote: > Given that there is no need to bundle the backport with the jsr292-mock, > I propose you something, > you create a project on github under your name, you give me the right to > push the code > (I will create a textual representation of the API so you will be able > to re-create the jar without > having the right rt.jar available) and after you are free to do what you > want with it :) Ok, that works! I have set up https://github.com/headius/jsr292-mock (you have access) with a pom.xml ready to deploy and a basic README. You can throw whatever you want in there and I'll structure it as appropriate for maven and get an artifact pushed. It would also be fine if you want to put the full generation pipeline in there; I can have the project require Java 8, produce Java 6 (or lower) sources, and just build/push with 8. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
jsr292-mock in Maven; coro-mock, unsafe-mock available
jsr292-mock: Ok Rémi, it's decision time :-) We *need* to get jsr292-mock into maven somehow for JRuby's new build, so we don't have to version the binary anymore. We'd be happy to help set up the maven pom.xml AND get a groupId set up via sonatype's maven service, or we could just start pushing the artifact under a groupId we own (com.headius or org.jruby). Ideally we'd agree between all users where to put it and handle (as a team) getting artifacts pushed. I'm right in thinking this does not change often, right? Unless there's visible API changes in 8, the same artifact will probably get pushed once and not change for a long time. It's up to you (Rémi) and other users whether we should also push the jsr292-backport to maven. We're not using it in JRuby right now. unsafe-mock and coro-mock: I have pushed two new artifacts to maven: com.headius.unsafe-mock and com.headius.coro-mock. unsafe-mock is basically just JDK8's Unsafe.java in artifact form. You would set up your build to fetch it, stick it into bootclasspath, and compile. The intent is to provide a full Unsafe API for compilation only; you must detect in your own code whether certain methods are actually available at runtime. We created this artifact because we use the new JDK8 "fences" API (when available) for our Ruby instance variable tables, but did not want to require JDK8 to build JRuby. coro-mock is a mock of the latest coroutine API from Lukas, provided in artifact form for the same reason. Since the API does not exist in any release JDK, just adding to classpath/dependencies will allow compiling against it. We use it for the Fiber (microthread) library in JRuby (though I'd bet coro does not work anymore...still need to get a JSR going for that). - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
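For reference, consuming one of these artifacts from a Maven build would be an ordinary dependency declaration along these lines (the version shown is hypothetical -- check the actual deployed coordinates):

```xml
<dependency>
  <groupId>com.headius</groupId>
  <artifactId>unsafe-mock</artifactId>
  <version>1.0</version> <!-- hypothetical version -->
  <!-- compile-only: the mock must never end up on the runtime classpath -->
  <scope>provided</scope>
</dependency>
```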
Re: idea: MethodHandle#invokeTailCall
On Fri, May 10, 2013 at 7:16 PM, Per Bothner wrote: > Fail hard is probably the wrong thing to do - except when debugging. > I think what you want is the default to not fail if it can't pop > the stack frame, but that there be a VM option to throw an Error > or even do a VM abort in those case. You'd run the test suite > in this mode. > > That assumes that there is well-specified a minimal set of > circumstances in which the inlining is done correctly, > so a compiler or programmer can count on that, and that this > set is sufficient for low-overhead tail-call elimination. > Making such guarantees would have to be explicit in the JVM spec, and then we're sorta back to requiring a hard tail call guarantee (a hard inlining guarantee to ensure tail calling happens is just a horse of a different color). There is actually a way to force inlining with the newer invokedynamic impl: a @ForceInline (I forget the actual name) annotation that the LambdaForm stuff uses internally. Now, if that were exposed as a standard JVM feature, we could make such a hard guarantee...and then we're back to having to tag calls or callees with annotations, which was something you wanted to avoid (why, exactly?). > I'll be happy when I can run Kawa with --full-tailcalls > as the default with at most a minor performance degradation. > If we don't get there, I'll be satisfied if at least it is > faster (and simpler!) than the current trampoline-based > implementation. There's still a tail call patch in the MLVM repo, rotting on the vine. :-) - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
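For contrast with hard tail calls, the trampoline approach mentioned above can be sketched in a few lines: each "tail call" returns a thunk instead of calling, and a driver loop bounces on the heap so the stack stays flat. This is a generic illustration, not Kawa's actual implementation:

```java
public class Trampoline {
    // A Thunk is a suspended tail call; run() yields either another Thunk
    // (keep bouncing) or a final result.
    @FunctionalInterface
    interface Thunk {
        Object run();
    }

    // The driver loop: replaces stack growth with a heap allocation per hop.
    static Object trampoline(Object r) {
        while (r instanceof Thunk) {
            r = ((Thunk) r).run();
        }
        return r;
    }

    // Mutually tail-recursive even/odd; direct recursion this deep would
    // overflow the stack, but the trampolined form runs in constant stack.
    static Object even(long n) {
        return n == 0 ? Boolean.TRUE : (Thunk) () -> odd(n - 1);
    }

    static Object odd(long n) {
        return n == 0 ? Boolean.FALSE : (Thunk) () -> even(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(trampoline(even(1_000_001L))); // false
    }
}
```

The per-hop allocation and megamorphic `run()` call are exactly the overhead a JVM-level tail call (or a guaranteed-inlining annotation) would eliminate.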
Re: Improving the speed of Thread interrupt checking
On Sat, May 11, 2013 at 3:37 AM, Alexander Turner wrote: > Would not atomic increment and atomic decrement solve the multi-interrupt > issue you suggest here? Such an approach is a little more costly because in > the case of very high contention the setters need to spin to get the > increment/decrement required if using pure CAS. That could be a lot of > cache flushes - but it would then be strictly correct (I don't actually > know how gcc or any other compiler goes about implementing add/sub): > > __sync_fetch_and_sub > __sync_fetch_and_add > Yes, we could guarantee that all interrupts get seen and cleared independently if we used an interrupt counter...but it's clear that's not provided for by the contract of current Thread#interrupt logic, regardless of how atomic you try to make it. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Improving the speed of Thread interrupt checking
An addendum: thread.interrupt *does* have other side effects, like breaking out of blocking IO operations. However, it still doesn't matter; you can mutex to try to guarantee that the IO interrupt and setting the bit happen atomically, but by being in blocking IO you already know the thread is not running. A different sequence: * Thread A performs a blocking IO operation and gets stuck. * Thread B attempts to interrupt it * Thread B acquires the lock, sets interrupt bit, and wakes A out of IO * Thread A wakes up and interrupt bit is set. It is retrieved, cleared, and handled as true. But... * Thread A performs a blocking IO operation and gets stuck. * Thread B attempts to interrupt it * Thread A wakes out of blocking IO, sees interrupt bit is not set, and proceeds * Thread B acquires the lock * Thread A ultimately handles the interrupt as false Now, depending on when B decides to proceed with actually interrupting blocking IO, A may have already stopped blocking and B might not see that...unless all interruptible blocking IO operations *also* acquire the interrupt mutex. But that doesn't work either; B can't acquire the interrupt mutex if A is holding it and blocking. If A releases the mutex upon blocking and tries to acquire it immediately after, we're back to square one...B may not see that A has completed blocking because A can't return from a blocking operation and acquire the mutex atomically, and B can't acquire the mutex and check blocking status atomically. It seems like you can't make any guarantees here either, even with locks. - Charlie On Sat, May 11, 2013 at 3:26 AM, Charles Oliver Nutter wrote: > On Sat, May 11, 2013 at 2:49 AM, Jeroen Frijters wrote: > >> I believe Thread.interrupted() and Thread.isInterrupted() can both be >> implemented without a lock or CAS. >> >> Here are correct implementations: >> > ... 
> >> Any interrupts that happen before we clear the flag are duplicates that >> we can ignore and any that happen after are new ones that will be returned >> by a subsequent call. The key insight is that the interruptPending flag can >> be set by any thread, but it can only be cleared by the thread it applies >> to. >> > > This may indeed be the case. My goal with considering CAS was to maintain > the full behavioral constraints of the existing implementation, which will > never clear multiple interrupts at once, regardless of duplication. > > If your assumption holds, then Vitaly's case is not a concern. His case, > again: > > * Thread A retrieves interrupt status > * Thread B sets interrupt, but cannot clear it from outside of thread A > * Thread A clears interrupt > > The end result of this sequence is indeed different if A's get + clear are > not atomic: the interrupt status after A returns would be clear rather than > set. However, *it does not really matter*. > > If we look at the *caller* of the interrupt checking, things become > obvious. > > Mutexed/atomic version: > > * Thread A makes a call to Thread.interrupt to get and clear interrupt > status > * Thread A acquires lock and gets interrupt status and clears it atomically > * Thread A returns from Thread.interrupt, reporting that the thread was > interrupted, and the caller knows it has been cleared > * Before Thread A proceeds any further (raising an error, etc), thread B > comes in and sets interrupt status. > > The result is that the interrupt is set, and there's nothing A can do to > ensure it has been cleared. A subsequent call to Thread.interrupted can be > preempted *after* the clear anyway. > > So, a different preemption order with mutex: > > * Thread A makes a call to Thread.interrupt to get and clear interrupt > status > * Before the mutex is acquired, Thread B swoops in, setting interrupt > status. 
> * Thread A proceeds to acquire mutex and only sees a single interrupt bit; > it gets status and clears it. > > So even an atomic version does nothing to guarantee what the interrupt > status will be after all threads are finished fiddling with the interrupt > bit; preemption can happen before or after the mutexed operation, producing > different results in both cases. > > Ultimately, this may actually be a flaw with the way Thread interrupt > works in the JVM. If there's potential for interrupt to be set twice or > more, the interrupted thread can't ever guarantee that the interrupt has > been cleared. > > In practice, this flaw may not matter; if you have one or more external > threads that interrupt a target thread N times, you have to assume (and > have always had to assume) the target thread will see anywhere from 1 to N > of those interrupts, depending on preemption. This does not change with any > of the proposed implementations.
Re: Improving the speed of Thread interrupt checking
On Sat, May 11, 2013 at 2:49 AM, Jeroen Frijters wrote: > I believe Thread.interrupted() and Thread.isInterrupted() can both be > implemented without a lock or CAS. > > Here are correct implementations: > ... > Any interrupts that happen before we clear the flag are duplicates that we > can ignore and any that happen after are new ones that will be returned by > a subsequent call. The key insight is that the interruptPending flag can be > set by any thread, but it can only be cleared by the thread it applies to. > This may indeed be the case. My goal with considering CAS was to maintain the full behavioral constraints of the existing implementation, which will never clear multiple interrupts at once, regardless of duplication. If your assumption holds, then Vitaly's case is not a concern. His case, again: * Thread A retrieves interrupt status * Thread B sets interrupt, but cannot clear it from outside of thread A * Thread A clears interrupt The end result of this sequence is indeed different if A's get + clear are not atomic: the interrupt status after A returns would be clear rather than set. However, *it does not really matter*. If we look at the *caller* of the interrupt checking, things become obvious. Mutexed/atomic version: * Thread A makes a call to Thread.interrupt to get and clear interrupt status * Thread A acquires lock and gets interrupt status and clears it atomically * Thread A returns from Thread.interrupt, reporting that the thread was interrupted, and the caller knows it has been cleared * Before Thread A proceeds any further (raising an error, etc), thread B comes in and sets interrupt status. The result is that the interrupt is set, and there's nothing A can do to ensure it has been cleared. A subsequent call to Thread.interrupted can be preempted *after* the clear anyway. 
So, a different preemption order with mutex:

* Thread A makes a call to Thread.interrupted to get and clear interrupt status
* Before the mutex is acquired, Thread B swoops in, setting interrupt status.
* Thread A proceeds to acquire the mutex and only sees a single interrupt bit; it gets status and clears it.

So even an atomic version does nothing to guarantee what the interrupt status will be after all threads are finished fiddling with the interrupt bit; preemption can happen before or after the mutexed operation, producing different results in both cases.

Ultimately, this may actually be a flaw with the way Thread interrupt works in the JVM. If there's potential for interrupt to be set twice or more, the interrupted thread can't ever guarantee that the interrupt has been cleared.

In practice, this flaw may not matter; if you have one or more external threads that interrupt a target thread N times, you have to assume (and have always had to assume) the target thread will see anywhere from 1 to N of those interrupts, depending on preemption. This does not change with any of the proposed implementations. The only safe situation is when you know interruption will happen only once within a critical section of code.

Put simply (tl;dr): even with atomic/mutexed interrupt set+clear, you can't make any guarantees about how many interrupts will be seen if multiple interrupts are attempted. If true, the mutex in the current implementation is 100% useless.

- Charlie

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
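Jeroen's elided implementations are not reproduced here, but the invariant he states — the flag may be set by any thread, yet cleared only by the thread it applies to — permits a lock-free shape like the following. This is my sketch, not his code; the class and field names are illustrative:

```java
// Stand-in for a per-thread interrupt bit where only the owning thread
// ever clears. Under that invariant, a plain volatile read-then-write
// suffices: interrupts landing before the clear are duplicates, and any
// landing after will be seen by the next call.
class OwnerClearedFlag {
    private volatile boolean interruptPending;

    void interrupt() {                 // any thread may call this
        interruptPending = true;
    }

    boolean isInterrupted() {          // non-clearing check
        return interruptPending;
    }

    boolean interrupted() {            // owner-only: get and clear
        boolean pending = interruptPending;
        if (pending) {
            interruptPending = false;
        }
        return pending;
    }
}
```

Note that the get and the clear here are not atomic with respect to each other; the claim is that under the owner-only-clears rule this does not change any observable guarantee, for exactly the reasons argued above.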
Re: Improving the speed of Thread interrupt checking
On Sat, May 11, 2013 at 1:46 AM, Alexander Turner wrote:
> Thanks for the explanation. I have recently (for the last 6 months) been
> involved with some very performance-centric multi-threaded work in
> profiling the JVM, using JVMTI as a profiling tool with C++ underneath. The
> code all uses JVM locks where locks are required - but as profilers need to
> be as invisible as possible I have been removing locks where they can be
> avoided.
>
> My experience here has indicated that on modern machines CAS operations are
> always worth a try compared to locks. The cost of losing the current
> quantum (even on *NIX) is so high that it is not worth paying unless a
> thread is truly blocked - e.g. for IO.
> ...
> In your case, inter-thread signalling is definitely not worth losing a
> quantum over.
>
> If I get the chance over the next couple of days I'll make a cut-down
> example of CAS over Thread.interrupt and run the profiler (DevPartnerJ)
> over it - it could be a great unit test.

Yes, it could be illustrative. Finding this code in Hotspot also makes me wonder what other VM-level state is "excessively guarded" by using locking constructs instead of lock-free operations like CAS.

The code involved is also not particularly complex. I may see if I can hack in a CAS version of the interrupt check+clear logic and see how things look as a result. The next step would be moving that CAS directly into the intrinsic, so it can optimize along with code calling it.

- Charlie

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Improving the speed of Thread interrupt checking
SwitchPoint is indeed an option, and I have used it in JRuby's compiler to reduce the frequency of checking for interrupt events. However in this case it is just a plain old library written in plain old Java that supports Java 6+. Using Indy stuff isn't really an option. Plus...it doesn't solve the performance issue of Thread.interrupted anyway :-) - Charlie (mobile) On May 10, 2013 5:48 PM, "Remi Forax" wrote: > On 05/10/2013 06:03 PM, Charles Oliver Nutter wrote: > > This isn't strictly language-related, but I thought I'd post here > > before I start pinging hotspot folks directly... > > > > We are looking at adding interrupt checking to our regex engine, Joni, > > so that long-running (or never-terminating) expressions could be > > terminated early. To do this we're using Thread.interrupt. > > > > Unfortunately our first experiments with it have shown that interrupt > > checking is rather expensive; having it in the main instruction loop > > slowed down a 16s benchmark to 68s. We're reducing that checking by > > only doing it every N instructions now, but I figured I'd look into > > why it's so slow. > > > > Thread.isInterrupted does currentThread().interrupted(), both of which > > are native calls. They end up as intrinsics and/or calling > > JVM_CurrentThread and JVM_IsInterrupted. The former is not a > > problem...accesses threadObj off the current thread (presumably from > > env) and twiddles handle lifetime a bit. The latter, however, has to > > acquire a lock to ensure retrieval and clearing are atomic. > > > > So then it occurred to me...why does it have to acquire a lock at all? > > It seems like a get + CAS to clear would prevent accidentally clearing > > another thread's re-interrupt. Some combination of CAS operations > > could avoid the case where two threads both check interrupt status at > > the same time. > > > > I would expect the CAS version would have lower overhead than the hard > > mutex acquisition. > > > > Does this seem reasonable? 
> >
> > - Charlie
>
> Hi Charles,
> if a long-running expression is the exception, I think it's better to use
> a SwitchPoint (or a MutableCallSite stored in a static final field for that).
> Each regex being parsed first registers itself in a queue, and a thread
> waits on the first item of the queue. If the time has elapsed, the
> SwitchPoint is switched off, so each Joni parser thread knows that
> something went wrong and checks the first item. The parser that timed out
> removes itself from the queue and creates a new SwitchPoint. So the check
> is done only when a parser runs too long.
>
> Rémi
>
> ___
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
> ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
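Rémi's SwitchPoint scheme — pay nothing on the fast path, and divert all parsers to a slow checking path only once a timeout fires — can be sketched in a few lines. This is a minimal illustration; the queue bookkeeping he describes is elided, and all names here are mine:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

class TimeoutCheck {
    static volatile SwitchPoint sp = new SwitchPoint();

    static boolean fast() { return false; }  // no timeout pending: keep running
    static boolean slow() { return true; }   // invalidated: go check the timeout queue

    // Build the check handle: while sp is valid, calls fold to fast();
    // after invalidation they fall through to slow().
    static MethodHandle checker() throws Exception {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodType t = MethodType.methodType(boolean.class);
        MethodHandle fastH = l.findStatic(TimeoutCheck.class, "fast", t);
        MethodHandle slowH = l.findStatic(TimeoutCheck.class, "slow", t);
        return sp.guardWithTest(fastH, slowH);
    }

    // The watchdog thread calls this when a parser has run too long.
    static void fireTimeout() {
        SwitchPoint.invalidateAll(new SwitchPoint[] { sp });
    }
}
```

The appeal is that the JIT can compile the guarded fast path down to nearly nothing until invalidation deoptimizes it — which is exactly why it only helps invokedynamic-capable, Java 7+ code, per Charlie's objection above.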
Re: Improving the speed of Thread interrupt checking
For your ABA case, I can think of a couple options:

* Instead of get, do getAndSet when clearing. Whether it is true or false, it will end up false, so clearing is not a big deal. However, we're always doing the write then, so perhaps...
* CAS(true, false) instead of just reading. If set, it will be cleared. If unset, the CAS will fail and we know it was not set. Again, not sure about the cost of this versus the simple read. It should usually fail, and I don't know that cost either.

I am not sure at what point the lock becomes the cheaper option, but it seems like it would still be more expensive than either of these.

And the clearing case is actually the common one; most users call Thread.interrupted, which gets and clears all at once. Even if you use the non-clearing Thread#isInterrupted, you probably still need to clear it after you respond to the interruption...in our case, raising an appropriate error to indicate the regex did not return in a reasonable amount of time. We don't want the interrupt flag to linger after that error is handled.

- Charlie (mobile)

On May 10, 2013 6:51 PM, "Vitaly Davidovich" wrote:
> How would you handle the following with just CAS:
> 1) thread A reads the status and notices that it's set, and then gets
> preempted
> 2) thread B resets the interrupt and then sets it again
> 3) thread A resumes and does a CAS expecting the current state to be
> interrupted, which it is - CAS succeeds and resets interrupt
>
> The problem is that it just reset someone else's interrupt and not the one
> it thought it was resetting - classic ABA problem.
>
> You'd probably need some ticketing/versioning built in there to detect
> this; perhaps use a uint with 1 bit indicating status and the rest is
> version number - then can do CAS against that encoded value.
>
> However, I'm not sure if this case (checking interrupt and clearing) is
> all that common - typically you just check interruption only - and so
> unclear if this is worthwhile.
> > Sent from my phone > On May 10, 2013 12:05 PM, "Charles Oliver Nutter" > wrote: > >> This isn't strictly language-related, but I thought I'd post here before >> I start pinging hotspot folks directly... >> >> We are looking at adding interrupt checking to our regex engine, Joni, so >> that long-running (or never-terminating) expressions could be terminated >> early. To do this we're using Thread.interrupt. >> >> Unfortunately our first experiments with it have shown that interrupt >> checking is rather expensive; having it in the main instruction loop slowed >> down a 16s benchmark to 68s. We're reducing that checking by only doing it >> every N instructions now, but I figured I'd look into why it's so slow. >> >> Thread.isInterrupted does currentThread().interrupted(), both of which >> are native calls. They end up as intrinsics and/or calling >> JVM_CurrentThread and JVM_IsInterrupted. The former is not a >> problem...accesses threadObj off the current thread (presumably from env) >> and twiddles handle lifetime a bit. The latter, however, has to acquire a >> lock to ensure retrieval and clearing are atomic. >> >> So then it occurred to me...why does it have to acquire a lock at all? It >> seems like a get + CAS to clear would prevent accidentally clearing another >> thread's re-interrupt. Some combination of CAS operations could avoid the >> case where two threads both check interrupt status at the same time. >> >> I would expect the CAS version would have lower overhead than the hard >> mutex acquisition. >> >> Does this seem reasonable? >> >> - Charlie >> >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
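The two lock-free get-and-clear shapes Charlie sketches above can be expressed against an AtomicBoolean standing in for the VM's interrupt bit. Names are illustrative, not HotSpot code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// A per-thread interrupt bit with the two proposed clearing strategies.
class PendingInterrupt {
    private final AtomicBoolean flag = new AtomicBoolean(false);

    void set() { flag.set(true); }          // any thread may interrupt

    boolean isSet() { return flag.get(); }  // non-clearing check

    // Option 1: unconditional swap. Always performs the write, but the
    // flag ends up false either way, so a concurrently-arrived duplicate
    // interrupt is absorbed.
    boolean clearViaSwap() {
        return flag.getAndSet(false);
    }

    // Option 2: CAS(true, false) instead of a plain read. If the flag was
    // set it is now cleared; if the CAS fails, the flag was not set and no
    // write happened at all.
    boolean clearViaCas() {
        return flag.compareAndSet(true, false);
    }
}
```

Note that since the flag carries no version, neither option can distinguish "the interrupt I saw" from "a fresh interrupt that replaced it" — which is Vitaly's ABA point, and why Charlie's answer is that the distinction doesn't matter for a single bit.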
Re: Improving the speed of Thread interrupt checking
You need CAS because one form of the interrupt check clears it and another does not. So the get + check + set of interrupt status needs to be atomic, or another thread could jump in and change it during that process. If it were just being read, then sure...it could simply be volatile. But since there's a non-atomic operation in there, a race might be possible.

I just took a deeper look at the intrinsic, to see if it avoids the lock...but unfortunately it does not. It adds fast paths for when the thread is not interrupted *and* clearing is not requested (Thread.interrupted clears, Thread#isInterrupted does not). So the typical use case of calling Thread.interrupted() to get and clear interrupt status still follows the slow, locking path all the time.

We are mitigating this in our code by using Thread#isInterrupted() (th.isInterrupted on a Thread object) to do the frequent checks, and then using Thread.interrupted to clear it only when it has been set. I think this will be ok, but the slow path still seems like it could benefit from a CAS impl instead of a lock.

- Charlie

On Fri, May 10, 2013 at 11:17 AM, Alexander Turner wrote:
> Charles,
>
> Why bother even using CAS?
>
> Thread A is monitoring Thread B. Thread B cooperatively checks to see if
> it should die.
>
> Therefore, you only need B to know when A has told it to shut down.
>
> Therefore, all you need is a volatile boolean. A volatile boolean is very
> much faster than a full CAS operation.
> http://nerds-central.blogspot.co.uk/2011/11/atomicinteger-volatile-synchronized-and.html
>
> Best wishes - AJ
>
>
> On 10 May 2013 17:03, Charles Oliver Nutter wrote:
>
>> This isn't strictly language-related, but I thought I'd post here before
>> I start pinging hotspot folks directly...
>>
>> We are looking at adding interrupt checking to our regex engine, Joni, so
>> that long-running (or never-terminating) expressions could be terminated
>> early. To do this we're using Thread.interrupt.
>> >> Unfortunately our first experiments with it have shown that interrupt >> checking is rather expensive; having it in the main instruction loop slowed >> down a 16s benchmark to 68s. We're reducing that checking by only doing it >> every N instructions now, but I figured I'd look into why it's so slow. >> >> Thread.isInterrupted does currentThread().interrupted(), both of which >> are native calls. They end up as intrinsics and/or calling >> JVM_CurrentThread and JVM_IsInterrupted. The former is not a >> problem...accesses threadObj off the current thread (presumably from env) >> and twiddles handle lifetime a bit. The latter, however, has to acquire a >> lock to ensure retrieval and clearing are atomic. >> >> So then it occurred to me...why does it have to acquire a lock at all? It >> seems like a get + CAS to clear would prevent accidentally clearing another >> thread's re-interrupt. Some combination of CAS operations could avoid the >> case where two threads both check interrupt status at the same time. >> >> I would expect the CAS version would have lower overhead than the hard >> mutex acquisition. >> >> Does this seem reasonable? >> >> - Charlie >> >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
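The mitigation Charlie describes — cheap non-clearing checks in the hot loop, with the expensive clearing call made only once the flag is actually seen — looks like this in plain Java (a sketch of the pattern, not JRuby's actual code):

```java
// Returns true (and clears the flag) if the current thread has been
// interrupted. The common, uninterrupted case takes only the
// non-clearing isInterrupted() fast path; the locking get-and-clear
// path of this era's HotSpot is paid only on an actual interrupt.
class InterruptPolling {
    static boolean checkAndClearInterrupt() {
        if (Thread.currentThread().isInterrupted()) {
            return Thread.interrupted();  // get-and-clear, slow path
        }
        return false;
    }
}
```

There is a small window where another thread could clear the flag between the two calls, but since only the owning thread normally clears its own interrupt, that is rarely a concern in practice.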
Re: idea: MethodHandle#invokeTailCall
Interesting idea...comments below.

On Fri, May 10, 2013 at 12:44 PM, Per Bothner wrote:
> So this idea came to me: Could we just add a method
> that tail-calls a MethodHandle? Maybe some variant of
> MethodHandle#invokeAsTailCall(Object...)
> This doesn't require instruction-set or classfile changes,
> "only" a new intrinsic method. Of course it's a bit more
> complex than that: The actual tail call, to be useful, has
> to be done in the method that does the invokeAsTailCall,
> not the invokeAsTailCall itself. I.e. the implementation
> of invokeAsTailCall has to pop not only its own (native) stack
> frame, but also the caller's.

Seems feasible to me. Ideally in any case where Hotspot can inline a method handle call (generally only if it's static final (?) or in the constant pool some other way) it should also be able to see that this is a tail invocation of a method call.

However...there are cases where Hotspot *can't* inline the handle (dynamically prepared, etc.), in which case you'd want invokeAsTailCall to fail hard, right? Or if it didn't fail...you're not fulfilling the promise of a tail call, and we're back to the debates about whether the JVM should support "hard" or "soft" tail-calling guarantees.

> One problem with invokeAsTailCall is it implies
> needless boxing, which may be hard to optimize away.
> Perhaps a better approach would be to use invokedynamic
> in some special conventional way, like with a magic
> CallSite. However, that makes calling from Java more
> difficult.

Making it signature-polymorphic, like invokeExact and friends, would avoid the boxing (when inlined).

- Charlie

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
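The boxing contrast Charlie alludes to already exists between the two invocation styles of the current API: signature-polymorphic invokeExact passes primitives through unboxed, while the varargs invokeWithArguments boxes everything into an Object[] — the shape Per's proposed `invokeAsTailCall(Object...)` would inherit. A small sketch (the class and method names are mine):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class SigPoly {
    static int add(int a, int b) { return a + b; }

    static MethodHandle addHandle() throws Exception {
        return MethodHandles.lookup().findStatic(SigPoly.class, "add",
                MethodType.methodType(int.class, int.class, int.class));
    }

    // Signature-polymorphic call: the ints flow through unboxed, and the
    // call can inline when the handle is constant. A hypothetical
    // invokeAsTailCall would want this same shape.
    static int viaExact() throws Throwable {
        return (int) addHandle().invokeExact(1, 2);
    }

    // Varargs call: the arguments are boxed into an Object[] first --
    // the "needless boxing" Per mentions.
    static Object viaVarargs() throws Throwable {
        return addHandle().invokeWithArguments(1, 2);
    }
}
```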
Improving the speed of Thread interrupt checking
This isn't strictly language-related, but I thought I'd post here before I start pinging hotspot folks directly...

We are looking at adding interrupt checking to our regex engine, Joni, so that long-running (or never-terminating) expressions could be terminated early. To do this we're using Thread.interrupt.

Unfortunately our first experiments with it have shown that interrupt checking is rather expensive; having it in the main instruction loop slowed down a 16s benchmark to 68s. We're reducing that checking by only doing it every N instructions now, but I figured I'd look into why it's so slow.

Thread.interrupted() does currentThread().isInterrupted(true), both of which are native calls. They end up as intrinsics and/or calling JVM_CurrentThread and JVM_IsInterrupted. The former is not a problem...it accesses threadObj off the current thread (presumably from env) and twiddles handle lifetime a bit. The latter, however, has to acquire a lock to ensure retrieval and clearing are atomic.

So then it occurred to me...why does it have to acquire a lock at all? It seems like a get + CAS to clear would prevent accidentally clearing another thread's re-interrupt. Some combination of CAS operations could avoid the case where two threads both check interrupt status at the same time.

I would expect the CAS version would have lower overhead than the hard mutex acquisition.

Does this seem reasonable?

- Charlie

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
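The every-N-instructions throttling mentioned above is just a counter in the interpreter's dispatch loop. A sketch (the constant, class, and structure are illustrative, not Joni's actual code):

```java
// Amortizes the expensive interrupt check over many interpreted
// instructions instead of paying for it on every dispatch.
class InstructionLoop {
    static final int CHECK_INTERVAL = 30_000;  // illustrative; tune per benchmark

    static int run(int[] instructions) throws InterruptedException {
        int counter = 0;
        int result = 0;
        for (int insn : instructions) {
            result += insn;  // stand-in for real instruction dispatch
            if (++counter == CHECK_INTERVAL) {
                counter = 0;
                if (Thread.interrupted()) {
                    throw new InterruptedException("regex interrupted");
                }
            }
        }
        return result;
    }
}
```

The trade-off is responsiveness: a pathological regex can now run up to CHECK_INTERVAL instructions past the interrupt before terminating.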
Re: JVM Summit Workshop/talk request
I think we can safely say there's a lot of interest in a European JVMLS-like event. I'll make every effort to be there if one is hosted...but I'll leave it up to you European folks to figure out where and when :-) Just give me plenty of advance notice ;-) - Charlie On Fri, Apr 12, 2013 at 5:28 AM, Ben Evans wrote: > +1 > > Stockholm's nice. Or there's always London... > > > On Fri, Apr 12, 2013 at 6:36 AM, Marcus Lagergren > wrote: >> >> +1 to that. We could probably host something in Stockholm too, if there >> is interest. (We are the third largest Oracle JVM engineering site in the >> world after Santa Clara and Burlington, MA). >> >> /M >> >> On Apr 11, 2013, at 10:09 PM, Charles Oliver Nutter >> wrote: >> >> > I would absolutely love to have a European edition of the JVM Language >> > Summit. It's not a very complicated event to put together, and it >> > would give us an opportunity to meet with more language folks than can >> > make it to California for JVMLS. >> > >> > You could count on my attendance. >> > >> > - Charlie >> > >> > On Thu, Apr 11, 2013 at 5:30 AM, MacGregor, Duncan (GE Energy >> > Management) wrote: >> >> I would certainly be interested, though travel budgets do seem to be >> >> tight >> >> this year. >> >> >> >> We could probably host it here in Cambridge if you guys want to come >> >> over >> >> to the UK. >> >> >> >> On 09/04/2013 08:19, "Julien Ponge" wrote: >> >> >> >>> Just an idea: would some of you be interested in having a meeting at >> >>> some >> >>> point in Europe? >> >>> >> >>> I (or Rémi) can probably organise something at our Unis. >> >>> >> >>> - Julien >> >>> >> >>> On Apr 9, 2013, at 4:55 AM, Mark Roos wrote: >> >>> >> >>>> Thanks for the interest. >> >>>> >> >>>> I added this workshop to my proposal. Inputs are welcome on how to >> >>>> make it a good workshop. 
>> >>>> >> >>>> mark >> >>>> >> >>>> Improving the performance of InvokeDynamic >> >>>> >> >>>> Now that we have some experience with InvokeDynamic its time >> >>>> to discuss strategies and efforts for performance improvement. >> >>>> We expect to have experts, HotSpot implementers and users >> >>>> discussing how to get the best performance >> >>>> possible.___ >> >>>> mlvm-dev mailing list >> >>>> mlvm-dev@openjdk.java.net >> >>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >>> >> >>> ___ >> >>> mlvm-dev mailing list >> >>> mlvm-dev@openjdk.java.net >> >>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> >> >> ___ >> >> mlvm-dev mailing list >> >> mlvm-dev@openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> > ___ >> > mlvm-dev mailing list >> > mlvm-dev@openjdk.java.net >> > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: JVM Summit Workshop/talk request
I would absolutely love to have a European edition of the JVM Language Summit. It's not a very complicated event to put together, and it would give us an opportunity to meet with more language folks than can make it to California for JVMLS. You could count on my attendance. - Charlie On Thu, Apr 11, 2013 at 5:30 AM, MacGregor, Duncan (GE Energy Management) wrote: > I would certainly be interested, though travel budgets do seem to be tight > this year. > > We could probably host it here in Cambridge if you guys want to come over > to the UK. > > On 09/04/2013 08:19, "Julien Ponge" wrote: > >>Just an idea: would some of you be interested in having a meeting at some >>point in Europe? >> >>I (or Rémi) can probably organise something at our Unis. >> >>- Julien >> >>On Apr 9, 2013, at 4:55 AM, Mark Roos wrote: >> >>> Thanks for the interest. >>> >>> I added this workshop to my proposal. Inputs are welcome on how to >>> make it a good workshop. >>> >>> mark >>> >>> Improving the performance of InvokeDynamic >>> >>> Now that we have some experience with InvokeDynamic its time >>> to discuss strategies and efforts for performance improvement. >>> We expect to have experts, HotSpot implementers and users >>> discussing how to get the best performance >>>possible.___ >>> mlvm-dev mailing list >>> mlvm-dev@openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >>___ >>mlvm-dev mailing list >>mlvm-dev@openjdk.java.net >>http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: [jvm-l] Improving the performance of stacktrace generation
I talked a bit with John Rose about this, and he agreed with me that a good partial measure might be to add APIs for getting a *partial* stack. Currently, Hotspot will limit how deep a stack trace it generates. This can have a very large impact on the performance of generating traces. The magic flag is -XX:MaxJavaStackTraceDepth=, and the default on my system is 1024. Here's a set of benchmarks of various trace depths from 1000 down to 2. Once you get down to 100 frames, performance of generating a stack trace starts to improve considerably. https://gist.github.com/headius/5365217 Unfortunately there's no API to get just a partial stack trace, via JVMTI or otherwise. The relevant code in Hotspot itself is rather simple; I started prototyping a JNI call that would allow getting a partial trace. Perhaps something like: thread.getStackTrace(depth) ...and something equivalent for JVMTI. John agreed that this would be a worthwhile feature for a JEP, and I'd certainly like to see it trickle into a standard API too. - Charlie On Thu, Apr 11, 2013 at 3:37 AM, wrote: > Hi Bob, > > I wrote an article last year on the cost and impact of JVMTI stack collection. > > http://www.jinspired.com/site/is-jvm-call-stack-sampling-suitable-for-monitoring-low-latency-trading-apps > > I would prefer to see the JVM come up with a standard API and mechanism to > allow the stack to be augmented with additional frames that not only include > Java code but more contextual information related to executing activity > (code, block, flow,) this would include other JVM languages. > > We provide this sort of thing already today for Java, JRuby/Ruby and > Jython/Python, even SQL, in our metering engine but would welcome an ability > to replicate this data to the VM itself so standard tools need not be > changed. What is cool about this is that we can simulate a stack in a remote > JVM that spans multiple real application runtimes. 
> > http://www.jinspired.com/site/jxinsight-opencore-6-4-ea-12-released > > Kind regards, > > William > >>-Original Message- >>From: Bob Foster [mailto:bobfos...@gmail.com] >>Sent: Sunday, July 8, 2012 01:32 AM >>To: jvm-langua...@googlegroups.com >>Cc: 'Da Vinci Machine Project' >>Subject: Re: [jvm-l] Improving the performance of stacktrace generation >> >>> Any thoughts on this? Does anyone else have need for >>lighter-weight name/file/line inspection of the call stack? >> >>Well, yes. Profilers do. >> >>Recall Cliff Click bragging a couple of years ago at the JVM Language >>Summit about how fast stack trace generation is in Azul Systems' OSs...and >>knocking Hotspot for being so slow. It turns out that stack trace >>generation is a very significant overhead in profiling Hotspot using JVMTI. >>Even CPU sampling on 20 ms. intervals can add 3% or more to execution time, >>almost entirely due to the delay in reaching a safe point (which also >>guarantees the profile will be incorrect) and generating a stack trace for >>each thread. >> >>But 3% is peanuts compared to the cost of memory profiling, which can >>require a stack trace on every new instance creation. In a profiler I wrote >>using JVMTI, I discovered that it was faster to call into JNI code on every >>method entry and exit (and exception catch), keeping a stack trace >>dynamically than to call into JNI only when memory was allocated and >>request a stack trace each time. The "fast" technique is about 3-10 times >>slower than running without profiling. The Netbeans profiler doesn't use >>this optimization, and its memory profiler when capturing every allocation, >>as I did, is 2-3 ORDERS OF MAGNITUDE slower than normal (non-server) >>execution. >> >>Faster stack traces would benefit the entire Hotspot profiling community. 
>> >>Bob >> >>On Sat, Jul 7, 2012 at 3:03 PM, Charles Oliver Nutter >>wrote: >> >>> Today I have a new conundrum for you all: I need stack trace >>> generation on Hotspot to be considerably faster than it is now. >>> >>> In order to simulate many Ruby features, JRuby (over)uses Java stack >>> traces. We recently (JRuby 1.6, about a year ago) moved to using the >>> Java stack trace as the source of our Ruby backtrace information, >>> mining out compiled frames and using interpreter markers to peel off >>> interpreter frames. The result is that a Ruby trace with mixed >>> compiled and interpreted code like this >>> (https://gist.github.com/3068210) turns into this >>> (https://gist.github.com/3068213). I consider this a great deal better >>> than the plain J
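As a historical footnote for readers on current JDKs: a depth-limited traversal like the `thread.getStackTrace(depth)` proposed in this thread became expressible in Java 9 via StackWalker, whose lazy frame stream only materializes the frames actually consumed. A sketch:

```java
import java.util.List;
import java.util.stream.Collectors;

class PartialTrace {
    // Equivalent in spirit to the proposed thread.getStackTrace(depth):
    // the walker only decodes as many frames as the stream consumes.
    static List<String> top(int depth) {
        return StackWalker.getInstance().walk(frames ->
                frames.limit(depth)
                      .map(f -> f.getClassName() + "." + f.getMethodName())
                      .collect(Collectors.toList()));
    }
}
```

This only covers the current thread; the cross-thread JVMTI variant discussed above still has no depth-limited standard equivalent.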
Re: JVM Summit Workshop/talk request
I will volunteer to be an expert. On Mon, Apr 8, 2013 at 2:53 PM, Mark Roos wrote: > I would love to put it together, but my knowledge is minimal. I don't mind > the > organizing part but I think we need some folks from the jvm side to be the > main > speaker/know it all(s). > > So if we can get some volunteers to be the experts I will gladly propose and > mc > a workshop/panel > > mark > > > > > From:Charles Oliver Nutter > To:Da Vinci Machine Project > Date:04/08/2013 12:26 PM > Subject:Re: JVM Summit Wrokshop/talk request > Sent by:mlvm-dev-boun...@openjdk.java.net > > > > > Indeed...I think we need to get all us invokedynamicists into the same > room to better understand what's working, what's not, and where to go > from here. Consider me in. > > I'm sure it would be accepted, so a proposal would probably be a > formality...but do you want to throw something together, Mark? > > - Charlie > > On Mon, Apr 8, 2013 at 2:03 PM, Mark Roos wrote: >> It seems like quite a bit of work is going on around improving the >> performance of invokeDynamic. >> It would be interesting ( at least to me ) to have an in depth discussion >> of >> what is being done and >> how I should adjust my usage to get the best performance for a dynamic >> language. >> >> I'll buy the drinks >> >> mark >> >> >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: JVM Summit Workshop/talk request
Indeed...I think we need to get all us invokedynamicists into the same room to better understand what's working, what's not, and where to go from here. Consider me in. I'm sure it would be accepted, so a proposal would probably be a formality...but do you want to throw something together, Mark? - Charlie On Mon, Apr 8, 2013 at 2:03 PM, Mark Roos wrote: > It seems like quite a bit of work is going on around improving the > performance of invokeDynamic. > It would be interesting ( at least to me ) to have an in depth discussion of > what is being done and > how I should adjust my usage to get the best performance for a dynamic > language. > > I'll buy the drinks > > mark > > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Looking for comments on paper draft "DynaMate: Simplified and optimized invokedynamic dispatch"
If it's not too late...I'd like to see the paper too :-)

And I also wonder whether we should start consolidating approaches a bit. InvokeBinder has become very feature-rich, now providing the ability to track arguments by name through the MH chain. I'm hoping to fill it out more and do a new release soon, but I'm using it for just about all my MH wrangling.

- Charlie

On Tue, Feb 19, 2013 at 7:37 AM, Eric Bodden wrote:
> Hi all.
>
> Kamil Erhard, a student of mine, and I have prepared a paper
> draft on a novel framework for invokedynamic dispatch that we call
> DynaMate. The framework is meant to aid language developers in using
> java.lang.invoke more easily by automatically taking care of common
> concerns like guarding and caching of method handles or adapting
> arguments between callers and callees.
>
> By March 28th, we plan to submit the draft to OOPSLA, at which point
> we will probably also make the publication available as a Technical
> Report, and will also open-source the implementation. Right now, I
> would like to use this email to reach out to experts in the community
> to get some feedback on this work, both in terms of what could be
> improved w.r.t. the paper and in terms of the DynaMate framework
> itself.
>
> So please let me know if you are interested in obtaining a copy of the
> draft to then provide us with feedback. In this case I would email you
> the PDF some time this week.
>
> Best wishes,
> Eric
>
> P.S. Here is the current abstract:
>
> Version 7 of the Java runtime includes a novel invokedynamic bytecode
> and API, which allow the implementers of programming languages
> targeting the Java Virtual Machine to customize the dispatch semantics
> at every invokedynamic call site. This mechanism is quite powerful and
> eases the implementation of dynamic languages, but it is also hard to
> handle, as it allows for many degrees of freedom and much room for
> error. While implementers of some dynamic languages have successfully
> switched to using invokedynamic, others are struggling with the steep
> learning curve.
> We present DYNAMATE, a novel framework allowing dynamic-language
> implementers to define dispatch patterns more easily. Implementations
> using DYNAMATE achieve reduced complexity, improved maintainability,
> and optimized performance. Moreover, future improvements to DYNAMATE
> can benefit all its clients.
> As we show, it is easy to modify the implementations of Groovy, JCop,
> JRuby, and Jython to base their dynamic dispatch on DYNAMATE. A set of
> representative benchmarks shows that DYNAMATE-enabled dispatch code
> usually achieves equal or better performance compared to the code that
> those implementations shipped with originally. DYNAMATE is available
> as an open-source project.
>
> --
> Eric Bodden, Ph.D., http://sse.ec-spride.de/ http://bodden.de/
> Head of Secure Software Engineering Group at EC SPRIDE
> Tel: +49 6151 16-75422    Fax: +49 6151 16-72051
> Room 3.2.14, Mornewegstr. 30, 64293 Darmstadt
> ___
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
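The "guarding and caching of method handles" the abstract refers to is the familiar inline-cache idiom that every invokedynamic language ends up hand-rolling. A minimal monomorphic class-check cache in raw java.lang.invoke looks like this — a sketch of the common pattern, not DynaMate's or InvokeBinder's actual API:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class MonomorphicCache {
    // Guard: does the receiver still have the class we cached for?
    static boolean classCheck(Class<?> expected, Object receiver) {
        return receiver.getClass() == expected;
    }

    // Build a call site caching `target` for receivers of class `expected`,
    // falling back to `slowPath` (which would normally relink) otherwise.
    static MutableCallSite cacheFor(Class<?> expected, MethodHandle target,
                                    MethodHandle slowPath) throws Exception {
        MethodHandle test = MethodHandles.lookup().findStatic(
                MonomorphicCache.class, "classCheck",
                MethodType.methodType(boolean.class, Class.class, Object.class));
        MethodHandle guard = test.bindTo(expected);
        return new MutableCallSite(
                MethodHandles.guardWithTest(guard, target, slowPath));
    }

    // Tiny demonstration: answer "fast" for String receivers, "slow" otherwise.
    static String dispatch(Object receiver) throws Throwable {
        MethodHandle fast = MethodHandles.dropArguments(
                MethodHandles.constant(String.class, "fast"), 0, Object.class);
        MethodHandle slow = MethodHandles.dropArguments(
                MethodHandles.constant(String.class, "slow"), 0, Object.class);
        MutableCallSite site = cacheFor(String.class, fast, slow);
        return (String) site.dynamicInvoker().invokeExact(receiver);
    }
}
```

Frameworks like DynaMate (or helpers like InvokeBinder) exist precisely so that each language implementation doesn't have to repeat this wiring, with all its type-matching pitfalls, at every call site.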
Perf regression since b72
I've been fiddling about with performance a bit again recently, and have noticed a perf degradation since b72. I mentioned this to the Nashorn guys, and Marcus discovered that InlineSmallCode=2000 helped them get back to b72 performance. I can confirm this on JRuby as well, so in any case it seems that something has regressed. Here are some numbers with JRuby, for b72, hotspot-comp, and hotspot-comp with InlineSmallCode=2000. You can see that current hotspot-comp builds do not perform as well as b72 unless that flag is passed. https://gist.github.com/headius/de7f99b52847c2436ee4 I have not yet started to explore the inlining or assembly output, but I wanted to confirm that others are seeing this degradation. My build of hotspot-comp is current. I do have some benchmarks that look fine without the additional flag (neural_net, for example), so I'm confused about what's different in the degraded cases. - Charlie
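For anyone who wants to try reproducing this, the flag goes straight on the JVM command line; JRuby forwards JVM options with its -J prefix. A sketch (the benchmark script name is a placeholder, not one of the actual benchmarks in the gist):

```shell
# Plain JVM invocation with the raised small-code inlining threshold
java -XX:InlineSmallCode=2000 -cp . SomeBenchmark

# JRuby passes JVM flags through with -J; script name is a placeholder
jruby -J-XX:InlineSmallCode=2000 bench/bench_richards.rb
```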
Re: New Ruby impl based on PyPy...early perf numbers ahead of JRuby
On Sat, Feb 9, 2013 at 1:07 PM, Thomas Wuerthinger wrote: > Do you also have startup performance metrics - I assume the numbers below > are about peak performance? It seems to warm up very quickly; the first iteration is sometimes 2x slower, but it rapidly settles. Overall startup time is considerably better than JRuby's. > What is the approximate % of language feature completeness of Topaz and do > you think this aspect is relevant when comparing performance? Hard to say. They consulted me and other Ruby implementers to learn the most difficult features to implement, and made sure they put those in place. But I've had a lot of trouble with these benchmarks, partially due to missing language features. The specific language features we recommended they implement before measuring perf are mostly related to closure state and cross-frame variable access. In JRuby, such things require allocation on the heap, and since closure-receiving methods don't specialize (or do context-sensitive caller-callee profiling), EA can never get rid of those structures. The allocation and value indirection kill us. - Charlie
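To make the closure-state point concrete, here is a minimal Java sketch (illustrative only, not JRuby's actual code) of the heap-box pattern: a mutable local captured by a closure has to live in a heap-allocated holder, and because the closure-receiving method is compiled once for all its callers, escape analysis inside it cannot prove the box never escapes.

```java
import java.util.function.IntUnaryOperator;

public class ClosureBox {
    // Compiled once for every caller and every lambda shape. Without
    // caller-sensitive specialization, the JIT sees a megamorphic call
    // through f, and the caller's heap box stays opaque to EA.
    static int sumWith(IntUnaryOperator f, int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) total += f.applyAsInt(i);
        return total;
    }

    public static void main(String[] args) {
        // Java locals captured by a lambda must be effectively final,
        // so mutable cross-frame state needs a heap-allocated holder --
        // the same shape JRuby uses for Ruby block-local variables.
        int[] counter = new int[1];
        int sum = sumWith(x -> { counter[0]++; return x * 2; }, 3);
        System.out.println(sum + " " + counter[0]); // 12 3
    }
}
```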
New Ruby impl based on PyPy...early perf numbers ahead of JRuby
So, that new Ruby implementation I hinted at was announced this week. It's called Topaz, and it's based on the RPython/PyPy toolchain. It's still very early days, of course, since the vast majority of Ruby core has not been implemented yet. But for the benchmarks it can run, it usually beats JRuby + invokedynamic. Some numbers:

* Richards is 4-5x faster on Topaz than on JRuby.
* Red/black is a bit less than 2x faster on Topaz than JRuby with the old indy impl, and a bit more than 2x faster than JRuby with the new impl.
* Tak and fib are each about 10x faster on JRuby. Topaz's JIT is probably not working right here, perhaps because the benchmarks are deeply recursive.
* Neural is a bit less than 2x faster on Topaz than on JRuby.

I had to do a lot of massaging to get these benchmarks to run due to Topaz's very incomplete core classes, but you can see where Topaz could potentially give us a run for our money. In general, Topaz is already faster than JRuby, and it still implements most of the "difficult" Ruby language features that usually hurt performance.

My current running theory for a lot of this performance is that the RPython/PyPy toolchain does a better job than Hotspot in two areas:

* It is a tracing JIT, so I believe it specializes code better. For example, closures passed through a common piece of code appear to still optimize as though they're monomorphic all the way. If we're ever going to have closures (or lambdas) perform as well as they should, closure-receiving methods need to be able to specialize.
* It does considerably better at escape detection than Hotspot's current escape analysis. Topaz does *not* use tagged integers, and yet numeric performance is easily 10x better than JRuby's. This also plays into closure performance.

Anyway, I thought I'd share these numbers, since they show we've got more work to do to get JVM-based dynamic languages competitive with purpose-built dynamic language VMs.
I'm not really *worried* per se, since raw language performance rarely translates into application performance (app perf is much more heavily dependent on the implementation of core classes, which are all Java code in JRuby and close to irreducible, perf-wise), but I'd obviously like to see us stay ahead of the game :-) - Charlie
Symbolic argument support in InvokeBinder
Sitting here at FOSDEM today I was showing Remi my new addition to InvokeBinder: named arguments.

Background: InvokeBinder is my little Java DSL/fluent API for building method handle chains. Short example:

MethodHandle mh = Binder
    .from(String.class, String.class, String.class)        // String w(String, String)
    .drop(1, String.class)                                 // String x(String)
    .insert(0, "hello")                                    // String y(String, String)
    .cast(String.class, CharSequence.class, Object.class)  // String z(CharSequence, Object)
    .invoke(someTargetHandle);

The new stuff I added is a Signature class for managing a MethodType along with an array of argument names, and SmartBinder to take advantage of that. How is this useful? The above example might be reworked as follows:

Signature sig = Signature
    .returning(String.class)
    .appendArg("arg1", String.class)
    .appendArg("arg2", String.class);

MethodHandle mh = SmartBinder
    .from(sig)
    .drop("arg2")                                          // String x(String)
    .prepend("argX", "hello")                              // String y(String, String)
    .cast(String.class, CharSequence.class, Object.class)  // String z(CharSequence, Object)
    .invoke(someTargetHandle);

So we can always use argument names rather than error-prone indices. This is especially useful for permutes, which I consistently get completely wrong:

MethodHandle incoming = /* handle with signature below */;

Signature sig = Signature
    .returning(String.class)
    .appendArg("arg1", String.class)
    .appendArg("arg2", String.class)
    .appendArg("arg3", String.class);

// permute without indices!
MethodHandle permuted = sig.permuteWith(incoming, "arg1", "arg3");

This is not in an InvokeBinder release yet because I want to add all Binder operations to SmartBinder, but I'm looking for feedback and other use cases for named arguments in signatures. Thanks!

- Charlie
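For contrast, here is what an index-based permute looks like with the raw JDK API (a standalone toy example, not taken from InvokeBinder): the reorder array maps each parameter of the target to a position in the new type, which is exactly the kind of mapping that is easy to get backwards.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class PermuteDemo {
    public static void main(String[] args) throws Throwable {
        // String.concat(String): type is (String receiver, String arg)String
        MethodHandle concat = MethodHandles.lookup().findVirtual(
                String.class, "concat",
                MethodType.methodType(String.class, String.class));

        // Swap the two arguments: reorder[i] says which parameter of the
        // NEW type feeds parameter i of the TARGET -- note the direction.
        MethodHandle swapped = MethodHandles.permuteArguments(
                concat,
                MethodType.methodType(String.class, String.class, String.class),
                1, 0);

        System.out.println(swapped.invoke("world", "hello ")); // hello world
    }
}
```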
Re: hotspot-comp OS X builds
Kris answered about JDK8. As for JDK7u, you can follow the 7u mailing list. It looks like u12 has been renumbered to u14 (probably to make room for any additional security releases that might be needed in a u13), but you basically just want to track hs24-bXX and related commit info. Others on this list might be able to give you a more definitive answer, but I have mostly been tracking Hotspot versions to know what features are where. FWIW, JRuby actually inspects the Hotspot version to know whether to default invokedynamic use to "on", since only hs24+ has fixed the NCDFE issue. - Charlie On Fri, Jan 25, 2013 at 5:56 AM, MacGregor, Duncan (GE Energy Management) wrote: > Can I just check whether all this stuff has made it into the 7u12 or 8 > snapshot releases, and if not, when it will? > > Alternatively I can do a Windows build myself from source if it's all made > it into the public repos. > > On 24/01/2013 22:47, "John Rose" wrote: > >>Thanks, Charlie! >> >>Yes, feedback makes us happy, especially small-but-representative >>benchmarks. >> >>- John >> >>On Jan 24, 2013, at 1:21 PM, Charles Oliver Nutter wrote: >> >>> I did some builds of hotspot-comp as of this afternoon for y'all to >>> download. This has the permgen removal, new indy impl + opto, partial >>> inlining, and other bits and bobs. >>> >>> I'm sure the Hotspot guys would appreciate feedback on indy >>> performance. As far as I know, all the indy opto stuff in this build >>> is on its way to 7u12, but that window may still be open for >>> additional patches.
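The version check itself can be sketched roughly like this (a hypothetical simplification, not JRuby's actual logic): parse the HotSpot major version out of the java.vm.version system property (which looks like "24.0-b56" on HotSpot) and only default indy on for hs24 and later.

```java
public class IndyDefault {
    // Hedged sketch: parse the leading HotSpot major version from a
    // java.vm.version-style string and gate invokedynamic on hs24+.
    // Real JRuby handles more version formats than this.
    static boolean indyDefaultOn(String vmVersion) {
        try {
            int dot = vmVersion.indexOf('.');
            int major = Integer.parseInt(vmVersion.substring(0, dot));
            return major >= 24;
        } catch (RuntimeException e) {
            return false; // unrecognized VM: stay conservative
        }
    }

    public static void main(String[] args) {
        System.out.println(indyDefaultOn("24.0-b56")); // true
        System.out.println(indyDefaultOn("23.2-b09")); // false
    }
}
```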
>>> https://s3.amazonaws.com/openjdk/index.html >>> >>> - Charlie