Re: A simple PIC api
Hi Jochen On why I am looking at projections for behavior determination. In my case I use an object reference in the receiver to determine the method to execute. One could say that this is a projection from the class to an address which is an integer. So in this case computing the projection is the same as testing for a class match. However I see this as limiting. The address could be 64 bits but I only have 5000 classes so many of the bits are not used. My theory is that if I replace the object reference with a long ( call it a bit vector ) and then compute the mapping of the class behavior to some set of bits then I can compute the target behavior by logical operations on the projections for each arg. Since these would all be ops on longs ( or ints maybe) they would be fast to perform. So my current efforts at analysis is to handle the following use cases. For call sites where the number of receivers it less than some small number I plan to just use the standard compare the class behavior approach. This could be faster if the item compared is a integer vs a reference. For cases where there are only a few implementations of a method but many receivers. I plan to use some logical combination of bits so select the method. For cases where the class of an argument plus the receiver is used. ( specialization ). Some bits again. For the case of many methods and many receivers. Not sure about this yet. This is usually some method like 'name' which returns a constant. Perhaps moving the constant to an instance var would trade memory size for performance. I also believe that I can determine these projections via a static analysis plus some profiling. One question will be how much does the profiling help. It may be true that a tree structured test is just as effective as manipulations of a bit vector. It may also be true that the cases which could benefit from projections do not impact the apps performance. Makes for a nice analysis. In any case I agree with you on performance being the goal. Probably a optimal GWT chain will in the end be all that is needed. regards mark ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Am 16.03.2015 00:42, schrieb Mark Roos: Jochen writes So far I have avoided using projections since I am not sure about how to do this. Basically I am missing a way to project a Class to an usable int. After this discussion, reading Rémi's suggested paper and some more analysis of my code base I am beginning to think that a projection based PIC could be a good solution. I agree with your concern about how such a projection would be computed but I feel that there is a solution for that. Hopefully it will hold up in reality. I was waiting a while to see if maybe somebody else jumps in and gives a suggestion... I had a bad experience once with trying to make a key for a method selection for caching in a hashmap. This is kind of an projection as well. But anyway... in the end the caching there costs almost as much as doing the method selection and the gain is near zero. An we are here the focus is not the cost of method selection so much as more an efficient way to compute an int out of receiver and arguments , which will be more efficient than a cascade of guards where each checks the classes for a cached call. In both cases you would have to get the classes of the arguments and the receiver once. And in both cases you would need a loop to check over the classes, unless you can be sure that your projection will be bimorphic. And I doubt that would be the case. At least not bimorphic and efficient to compute at the same time. So in the end it is a question of what can be optimized better. bye Jochen -- Jochen blackdrag Theodorou - Groovy Project Tech Lead blog: http://blackdragsview.blogspot.com/ german groovy discussion newsgroup: de.comp.lang.misc For Groovy programming sources visit http://groovy-lang.org ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Jochen writes So far I have avoided using projections since I am not sure about how to do this. Basically I am missing a way to project a Class to an usable int. After this discussion, reading Rémi's suggested paper and some more analysis of my code base I am beginning to think that a projection based PIC could be a good solution. I agree with your concern about how such a projection would be computed but I feel that there is a solution for that. Hopefully it will hold up in reality. regards mark___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Am 12.03.2015 19:37, schrieb Mark Roos: Jochen The comment on the test part of the pic is interesting. Since I am looking at multimethods I would like to have a better understanding of how you decide which code to dispatch at a site. We currently have no PIC in Groovy. The naive implementation would be using the guards I use to check for the validity of my current method selection. But that means several guards to check the receiver and each argument as well as a switchpoint to ensure the validity of the meta class. So for a N argument call we have up-to N+1 guards. The guard will ensure the value is of the expected class and not null. Since I transport static information if available I can skip the guard for primitives. I could in theory skip the class check for final classes, but since null would be still a legal value and since it would potentially change the selected method, I would still have a guard. What I could do further, which I haven't done yet, is to check the overloads of the method. If there are none, I need only one guard for the receiver. And depending on the signatures I could also omit single guards. Problem is that the internal structures we currently have, have not been written to make this happen efficiently, which is why I did not go further yet. My pic suggestion assumes that one test method is applied to the arguments and its result used to select the code to execute. This may be too limiting to be the general case. But you made me think... considering a guard like this: boolean testArguments(Object[] args, Class[] types) { for (int i=0; itypes.length; i++) { if (types[i]!=null) { if (args[i]==null || args[i].getClass()!=types[i]) return false; } else { if (args[i]!=null) return false; } return true; } plus some dropping of arguments to skip parts I don't want to guard, I could indeed go with one test handle. For null and class. The question though is how the performance of this would be. (Haven't tested it) The way I reason about a pic is that it is presented the current stack args and if it does not see a match the stack and the site are passed to the fallback. The fallback has two responsibilities, to find the code for the current args and to generate a test which when applied to args in the future will dispatch the same code. Yes, that was the same in my suggestion I see this test as a pattern match against some facet of the args. Since I use a chain of GWTs each guard can have a different test/match. I am looking at some form of bit vector for the facets and the pattern to match ( as in the paper Rémi suggested) so I don't have to have a complex tree like test. Yes, a tree of tests should be avoided, I totally agree. Same for the code I have shown above probably. So far I have avoided using projections since I am not sure about how to do this. Basically I am missing a way to project a Class to an usable int. I guess if I wanted to I could use the hashcode sum as a potentially quick first test and then continue from there. Perhaps my original thought of just declaring the site to be a pic and letting the jvm do its best optimizing the target will be the best. needs testing I guess bye Jochen -- Jochen blackdrag Theodorou - Groovy Project Tech Lead blog: http://blackdragsview.blogspot.com/ german groovy discussion newsgroup: de.comp.lang.misc For Groovy programming sources visit http://groovy-lang.org ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Am 12.03.2015 00:13, schrieb Mark Roos: Hi Jochen I have to think about yours some more but I thought I would share mine. I condensed it to make it easier to explain. my pleasure […] I extended callsite to hold a bunch of values one of which is the depth. And my model for the cache is up to 10 chained GWTs after that I drop the chain and start over. When I get a miss in the cache off to the fallback ( look below to see how this MH is made) *static*RtObject fallbackSelf(RtCallSite site, Object... args) For each guard I create a test MH which takes all of the args and returns a bool. Since I only look at the TOS it drops the rest, yours is more complicated. This can be different for each guard in my case. RtObject rcvr=(RtObject)args[args.length- 1]; // top of stack test=MethodHandles./insertArguments/(test, 0, rcvr.classField()); // the behavior field test=MethodHandles./dropArguments/(test, 0, site.dropArray()); // match test signature then finds the code based on the site and the receiver in this case MethodHandle newTarget=rcvr.lookupSelf(site.getSelector()); makes a new GWT with the existing chain as its slow path my main problem is that in most cases a single test won't do. In many cases I have to check at least the class of the receiver as well as of each argument. That makes for a 2 arg call a least 3 tests. But I cannot really express a Java like . Instead I have to use test number 1, with slow path in the false case and the real invocation target in the true case. Then for each additional test I use the root of the MethodHandle tree as the true case and the slow path as fallback. Or is there a better way to do this? Maybe that would be a nice addition to the MethodHandles API actually. Something to be able to express logic and/or. I imagine this could simplify code generation in the JVM and reduce the number of Objets you need. if the cache is full will make the bootstrap MH (see below) the target dropping the existing cache ***if*(site._depth 10) MethodHandle gwt=MethodHandles./guardWithTest/(test, newTarget, site.getRealTarget()); site._depth= site._depth+ 1; // bump the depth inserts it at the start of the chain by making it the new target site.setRealTarget(gwt); and invokes it ( has to spread the args here as they were compressed on the way to the fallback) MethodHandle invoke=newTarget.asSpreader(Object[].*class*, site.getAirity()); result=(RtObject)invoke.invokeExact((Object[])args); yeah, in Remi's example it falls back ot a vtable like approach. But such things would in my proposal go into the megamorphicFallback and there you can of course easily make a handle that will reset the cache Of course the API I suggested is more complicated, because it tries to abstract. Remi's code for example works and is fairly small. But since the goal was to have a certain API the JVM can look at in more detail, since it will follow a certain structure I tried to come up with something we all can use. It is far from perfect of course and surely needs simplification here and there or does not consider other things. I only thought I give that idea to the list to have some base to work with, because I am not sure how I would be able to use your suggest API for Groovy for example bye Jochen -- Jochen blackdrag Theodorou - Groovy Project Tech Lead blog: http://blackdragsview.blogspot.com/ german groovy discussion newsgroup: de.comp.lang.misc For Groovy programming sources visit http://groovy-lang.org ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
We did have optional instrumentation to maintain the PIC counts, and used that to guide our choice of ordering, but we didn¹t use it on a per PIC level to do anything at run time, it was just a case of gathering a lot of data and printing out the stats. It did add some overhead, but I think Vlad¹s work for profiling of GWTs show that it could be done with an efficient API. How much real benefit it would offer though, I don¹t know. Duncan. On 11/03/2015 17:48, Marrows, George A (GE Energy Management) george.marr...@ge.com wrote: so the order of the handles is never changed to for example trying the last one first or the one with the most hits recently. Is it not worth the trouble? We found a useful gain at one point from reordering GWT chains so that the new MH was put on the end of the chain and the chain as a whole was therefore sorted chronologically, with first seen at the top. That obviously requires a bit of book-keeping to arrange. We haven't experimented with maintaining our own hit counts, and in fact I think we last checked the perf gain from reversing the chain on Java 7, so results may no longer be relevant. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Jochen The comment on the test part of the pic is interesting. Since I am looking at multimethods I would like to have a better understanding of how you decide which code to dispatch at a site. My pic suggestion assumes that one test method is applied to the arguments and its result used to select the code to execute. This may be too limiting to be the general case. The way I reason about a pic is that it is presented the current stack args and if it does not see a match the stack and the site are passed to the fallback. The fallback has two responsibilities, to find the code for the current args and to generate a test which when applied to args in the future will dispatch the same code. I see this test as a pattern match against some facet of the args. Since I use a chain of GWTs each guard can have a different test/match. I am looking at some form of bit vector for the facets and the pattern to match ( as in the paper Rémi suggested) so I don't have to have a complex tree like test. Perhaps my original thought of just declaring the site to be a pic and letting the jvm do its best optimizing the target will be the best. ragards mark___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Thanks Rémi, I was looking for a paper like that. Not for multimethods but for a way to improve code reuse across a hierarchy. Will savor it later with a fine pinot :) What I was thinking about for multi methods was a simpler tree like approach. http://dl.acm.org/citation.cfm?id=28732 In my proposed pic the test method handle examines the arguments and returns a behavior token. By using a custom test for that call site I believe that the overhead will be acceptable. But I don't want to recalculate it N times for N guards. Of course I will be able to implement this with just the current MH support. If you want efficiency, you still have the issue that to be efficient you need to inline the code of all method handles inside the code of the caller and you can not do that if there are too many possible implementations i.e. if the callsite is megamorphic. My thoughts around megamorphic sites is to note that they very often have only a few implementations ( like Object at: ). I was thinking about inverting the test logic when a site gets too large and I know there are less than N implementations. The paper you referenced may help here. Currently, if a callsite is monomorphic and you have a way to prove it upfront, you can use a switchpoint Agree, but all sites can be assumed to be such until proven otherwise (80%) It would be nice if a single GWT would be no more expensive. If you have a polymorphic callsite, guards are your savior, it's work pretty well until you have too many guards (it's a linear scan after all). Yes this is my current model If you have a megamorphic callsite, you can use an exactInvoker + a fold, but you basically give up on inlining. This may be a good fallback The other solution is to ask the VM to do a tableswitch on the method handle value (or on an int that represents the method handle value). We have no specific method handle for this case so you will have to generate bytecode. I was going to try this ( I already do my own bytecodes ) so thanks for the reinforcement. My roadblock here is how to convert the 20K behaviors to the int to use for the table switch, it would be very sparse. I think due to the rarity of sites with more than 10 receivers that I may just take the hit of letting them be rebuilt when they pass some size. This follows an observation that often sites which are megamorphic in the long term are not over shorter time spans. So let me try to understand what you want, you want a combiner that will do an inlining cache or a polymorphic inlining cache with guards if there are few possible targets, this combiner will be able re-arrange the guard checks because the user guarantee that the check do no side effect. And if there are too many guards, the combiner will use a tableswitch to still try to inline targets at callsite until there are so many possible targets that the cominer will in that case only use an exactInvoker + a fold. how far am I from what you are thinking ? I think I would be happy with that. But I would also be happy with a simpler building block which satisfies a subset of these requirements with blazing speed. My minimum is a construct which manages a short ( 1 to ?? ) inline set of GWTs and turns them into compare/branch instructions. takes a method handle which when passed the stack args returns an integer to use for the selection so only one execution of the test method if there is a miss or overflow passes to the fallback mh for a lookup can optionally insert the new mh into the front of the managed set or just execute it and discard it ( handles megamorphic or a table lookup) based on the reply from the fallback ( a tuple of int,mh) all paths are assumed fast always inlined the optimizer doing the right thing thanks again mark ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Rémi your suggested paper and comments caused me to take a look at my code base some more. What I found was that for a given selector+arity 93% of them have 5 implementations or less ( across 2000 classes and 25K methods). Combining this with my prior observations that 99% of the call sites have less than 5 receivers implies to me that a 5 element pic with appropriate pattern matching could cover most everything. The key will be how to come up with a pattern (bit vector?) from the args which can be tested quickly. I think I may be able to statically compute this given a 64 bit field as the arg behavior identifier. Thanks again, off to follow the cites. mark fyi 60% had only one implementation___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Am 11.03.2015 15:37, schrieb Remi Forax: [...] Sometime, yes, if you count the frequency of each branch by example, here is another code that implements a bimorphic inlining cache that expand to a dispatch table local to a callsite if necessary, I think you can adapt this code to implement what Mark want quite easily https://code.google.com/p/jsr292-cookbook/source/browse/trunk/bimorphic-cache/src/jsr292/cookbook/bicache/RT.java but obviously the DispatchMap case is not something that will be nice to performance, or not? I am was wondering how performance would drop if I have an array for counting invocations for each part of the cache as well as total number of counts and how that would impact performance. Because obviously on each invocation I would have to call a little method that does the counting as well as an additional guard to cause the reorganization once a certain threshold is reached... And I know every guard will lower performance. Anyway... based on Marks initial post as well as Remi's idea I would come up with the following... which is mostly an abstraction and refactoring of Remi's code. class InlineCacheT extends MutableCallSite { int MAX_DEPTH = 5 int depth T callsite ... InlineCache(T callsite) {...} public void installCache() public final Object internalFallback(...) public abstract MH fallback(T callsite, Object[] args) public abstract MH megamorphicFallback(T callsite, Object[] args) public final MH insert(MH target, MH... tests) } in case of a cache miss and max depth not reached fallback is called and supposed to call insert for the new target, the method will do the integration into the cache, and return a handle you can return from fallback. megamorphicFallback would be called in case of maximum depth is reached and install for example a virtual dispatch or whatever else. installCache would the cache logic as target of the callsite. internalFallback would increase the count as well as calling either fallback or megamorphicFallback. I didn't want to let the cache extend the callsite, since others may need the callsite class for their own purposes. Separation could allow for for example using a different cache type. Taking your code from https://code.google.com/p/jsr292-cookbook/source/browse/trunk/inlining-cache/src/jsr292/cookbook/icache/RT.java basically line 42 would be in internalFallback 43-45 in megamorphicFallback, 46-47 in internalFallback, 50-57 in fallback, 56-57 would be the test given to insert, 59-63 in internalFallback, and of course we would not link to fallback, but to internalFallback What do you guys think? bye Jochen -- Jochen blackdrag Theodorou - Groovy Project Tech Lead blog: http://blackdragsview.blogspot.com/ german groovy discussion newsgroup: de.comp.lang.misc For Groovy programming sources visit http://groovy-lang.org ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
RE: A simple PIC api
so the order of the handles is never changed to for example trying the last one first or the one with the most hits recently. Is it not worth the trouble? We found a useful gain at one point from reordering GWT chains so that the new MH was put on the end of the chain and the chain as a whole was therefore sorted chronologically, with first seen at the top. That obviously requires a bit of book-keeping to arrange. We haven't experimented with maintaining our own hit counts, and in fact I think we last checked the perf gain from reversing the chain on Java 7, so results may no longer be relevant. -Original Message- From: mlvm-dev [mailto:mlvm-dev-boun...@openjdk.java.net] On Behalf Of Jochen Theodorou Sent: 11 March 2015 14:00 To: Da Vinci Machine Project Subject: Re: A simple PIC api Am 11.03.2015 14:46, schrieb Jochen Theodorou: [...] I really should write a PIC implementation for Groovy :( actually picking up on that maybe you guys could give some advice about how to structure a PIC internally. https://code.google.com/p/jsr292-cookbook/source/browse/trunk/inlining-cache/src/jsr292/cookbook/icache/RT.java uses basically guardWithTest, so the order of the handles is never changed to for example trying the last one first or the one with the most hits recently. Is it not worth the trouble? bye Jochen -- Jochen blackdrag Theodorou - Groovy Project Tech Lead blog: http://blackdragsview.blogspot.com/ german groovy discussion newsgroup: de.comp.lang.misc For Groovy programming sources visit http://groovy-lang.org ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
On 03/11/2015 11:12 PM, Mark Roos wrote: From Jochen Do I also understand right, that your test for checking if the current target is still valid is limited to only the receiver? Well yes and no. In my case the test examines all of the arguments on the stack and computes an 'behavior' reference. This reference is the head of a linked list of method dictionaries which the fallback searches to locate the code for this callsite and that behavior. The yes part is that in my Smalltalk the behavior is stored in a field of the object on the top of the stack. This happens to be the receiver. For the proposed api the test can examine the entire stack and callsite to determine the dispatch. I do plan to support multi methods which could be an expensive test. I just wanted to do it once regardless of the number of guards. I also do special things to not add guards if I can, since each guard costs... though the information I use there is static so far I would expect in a perfect world for the guards to be cheap if the jvm knew the were part of a pic. Do you mean you plan to use something like this ? http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.37.594rep=rep1type=pdf the result of the dispatch algorithm will be either a method handle corresponding to the implementation to call. If you want efficiency, you still have the issue that to be efficient you need to inline the code of all method handles inside the code of the caller and you can not do that if there are too many possible implementations i.e. if the callsite is megamorphic. Currently, if a callsite is monomorphic and you have a way to prove it upfront, you can use a SwitchPoint with no guard at all (think something like there is only one class that implement this interface). If you have a polymorphic callsite, guards are your savior, it's work pretty well until you have too many guards (it's a linear scan after all). If you have a megamorphic callsite, you can use an exactInvoker + a fold, but you basically give up on inlining. The other solution is to ask the VM to do a tableswitch on the method handle value (or on an int that represents the method handle value). We have no specific method handle for this case so you will have to generate bytecode. So let me try to understand what you want, you want a combiner that will do an inlining cache or a polymorphic inlining cache with guards if there are few possible targets, this combiner will be able re-arrange the guard checks because the user guarantee that the check do no side effect. And if there are too many guards, the combiner will use a tableswitch to still try to inline targets at callsite until there are so many possible targets that the cominer will in that case only use an exactInvoker + a fold. Currently, this can not be implemented easily for two reasons, - you may not want profile counters that profile the different branches to be included in the JITed code, so you need a way to specify a special method handle that will execute a method handle in the interpreter but that will not appear in JITed code. - you want the VM to be able to generate a tableswitch which is not something the VM is able to do now. how far am I from what you are thinking ? regards mark cheers, Rémi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Hi Jochen uses basically guardWithTest, so the order of the handles is never changed to for example trying the last one first or the one with the most hits recently. Is it not worth the trouble? In my case I always add the new gwt to the head of the chain. This fits my use case where the most recent tends to be the most common at the moment. It was also very easy to code using method handles directly. Not sure how you would do it in java. Also when it gets too deep I just drop the chain and start over. regards mark___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
On 03/11/2015 11:12 PM, Mark Roos wrote: Remi commented I think you can adapt this code to implement what Mark want quite easily I don't disagree that pics are easy to code, my premise is that with a construct such I I proposed the jvm would do a better job of optimizing. Especially taking into account invalidation, multi core memory model and volatile state. You have to be more specific, I don't understand exactly what you mean ? If you don't think the performance gains would be worth the effort then I will defer to you and beat some other drum. :) regards mark cheers, Rémi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
From Jochen Do I also understand right, that your test for checking if the current target is still valid is limited to only the receiver? Well yes and no. In my case the test examines all of the arguments on the stack and computes an 'behavior' reference. This reference is the head of a linked list of method dictionaries which the fallback searches to locate the code for this callsite and that behavior. The yes part is that in my Smalltalk the behavior is stored in a field of the object on the top of the stack. This happens to be the receiver. For the proposed api the test can examine the entire stack and callsite to determine the dispatch. I do plan to support multi methods which could be an expensive test. I just wanted to do it once regardless of the number of guards. I also do special things to not add guards if I can, since each guard costs... though the information I use there is static so far I would expect in a perfect world for the guards to be cheap if the jvm knew the were part of a pic. regards mark___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Remi commented I think you can adapt this code to implement what Mark want quite easily I don't disagree that pics are easy to code, my premise is that with a construct such I I proposed the jvm would do a better job of optimizing. Especially taking into account invalidation, multi core memory model and volatile state. If you don't think the performance gains would be worth the effort then I will defer to you and beat some other drum. regards mark___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
The overhead is negligible in my micro-benchmarks for monomorphic and trimorphic call sites compared to regular Java virtual calls. - Julien On Tue, Mar 10, 2015, at 18:25, Mark Roos wrote: From Julian That being said performance of the PIC construct is very good in our case. Do you have any quantitative data on your call site performance vs number of targets? thx mark ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Hi Rémi, I assume you want me to be more specific about my concerns on: taking into account invalidation, multi core memory model and volatile state. My model is that I have a GWT chain, a cache, and a fallback which is updating the chain, and more than one core using the same callsite. So I have shared mutable state in the callsite holding things like depth, a list of receivers and the current target. It is likely that the two threads could be setting different targets etc. So I worry about syncing access. But that seems like it could be expensive even though the possibility of a collision would be rare. Seems like a place for STM :). I currently lock the callsites mutable parts but I am not sure if I can lock the entire site or if I even should. Obviously most to the time the site can be shared with no issues. Perhaps its enough to lock the fallback? The volatile state question is about the sharing of the call sites across threads. It really ok if the two threads create different caches as the code executed is always correct. But when I change code on the fly the caches all need to update so I need to make the cache volatile. I can do this now with a switch point so its not a big deal. My invalidation concern is that as I update my chain that the jvm decides that I no longer deserve to be inlined. Especially when I drop the entire chain to rebuild it many times. As I use object references for my test comparison I wonder if as they need to be tracked by the GC does this add overhead. I hope these are invalid concerns but if they were not I would think that letting the jvm know this was a pic would allow it to handle them with a minimum loss in performance. regards mark___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Hi Jochen I have to think about yours some more but I thought I would share mine. I condensed it to make it easier to explain. regards mark I extended callsite to hold a bunch of values one of which is the depth. And my model for the cache is up to 10 chained GWTs after that I drop the chain and start over. When I get a miss in the cache off to the fallback ( look below to see how this MH is made) static RtObject fallbackSelf(RtCallSite site, Object... args) For each guard I create a test MH which takes all of the args and returns a bool. Since I only look at the TOS it drops the rest, yours is more complicated. This can be different for each guard in my case. RtObject rcvr=(RtObject)args[args.length - 1]; // top of stack test=MethodHandles.insertArguments(test, 0, rcvr.classField()); // the behavior field test=MethodHandles.dropArguments(test, 0, site.dropArray()); // match test signature then finds the code based on the site and the receiver in this case MethodHandle newTarget=rcvr.lookupSelf(site.getSelector()); makes a new GWT with the existing chain as its slow path if the cache is full will make the bootstrap MH (see below) the target dropping the existing cache if(site._depth 10) MethodHandle gwt=MethodHandles.guardWithTest(test, newTarget, site .getRealTarget()); site._depth = site._depth + 1; // bump the depth inserts it at the start of the chain by making it the new target site.setRealTarget(gwt); and invokes it ( has to spread the args here as they were compressed on the way to the fallback) MethodHandle invoke=newTarget.asSpreader(Object[].class, site .getAirity()); result=(RtObject) invoke.invokeExact((Object[])args); You probably need to see the bootstrap to understand the invoke. To limit myself to only one fallback signature I bind the callsite and collect the args MethodHandle mh=lookup.findStatic(RtCallSite.class, fallbackSelf, mt); MethodHandle mh2=MethodHandles.insertArguments(mh, 0, site); // insert the call site MethodHandle mh3=mh2.asCollector(RtObject[].class, site .getAirity()); // combine the args into an array site.setBootstrapTarget(mh3);___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Am 09.03.2015 18:51, schrieb Mark Roos: I was thinking about a generic pic, easy to use but flexible and came up with the following concept api. By passing the callsite and the testValue around with allowing an optional pic update I can envision this as a nice building block for several caching strategies. Of course this is all based on my use model which does not use primitives. In summary it reduces the calling of the test method to only once and it supports the quick testing of a few targets. Its also easy to understand how to use without using Charlies or Remis builders. Do I also understand right, that your test for checking if the current target is still valid is limited to only the receiver? to generate a PIC MethodHandle createPic(Callsite site, MethodHandle getTestValue, MethodHandle fallback) The getTestValue methodHandle returns the object used in the comparison process Object getTestValue(Callsite site, Object... stackArgs) The fallback returns a testValue and MH to use Object[] fallback(Callsite site, Object testValue, Object... stackArgs) where the return value is [Object newTestValue][MethodHandle newMH] // use case for Tuple if(newTestValue != null) lookup.insert(newTest, newMH) newMh.invoke() This path should also be considered fast as the first step of a fallback could be to check a cache of MH. the lookup process could be a simple chain of compare/branch operations. And the mh replacement could be a simple circular update. This chain could be limited to less than 5. I really should write a PIC implementation for Groovy :( Though it won't be as simple as this one. I need to check for arguments which can be primitives) and receiver as well as meta class (switchpoint for that one) So not only do I have to check more things, I also do special things to not add guards if I can, since each guard costs... though the information I use there is static so far. bye Jochen -- Jochen blackdrag Theodorou - Groovy Project Tech Lead blog: http://blackdragsview.blogspot.com/ german groovy discussion newsgroup: de.comp.lang.misc For Groovy programming sources visit http://groovy-lang.org ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
Am 11.03.2015 14:46, schrieb Jochen Theodorou: [...] I really should write a PIC implementation for Groovy :( actually picking up on that maybe you guys could give some advice about how to structure a PIC internally. https://code.google.com/p/jsr292-cookbook/source/browse/trunk/inlining-cache/src/jsr292/cookbook/icache/RT.java uses basically guardWithTest, so the order of the handles is never changed to for example trying the last one first or the one with the most hits recently. Is it not worth the trouble? bye Jochen -- Jochen blackdrag Theodorou - Groovy Project Tech Lead blog: http://blackdragsview.blogspot.com/ german groovy discussion newsgroup: de.comp.lang.misc For Groovy programming sources visit http://groovy-lang.org ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
How is it different from Rémi's construct? https://code.google.com/p/jsr292-cookbook/source/browse/trunk/inlining-cache/src/jsr292/cookbook/icache/RT.java (we use that pattern in Golo) - Julien On 09 Mar 2015, at 18:51, Mark Roos mr...@roos.com wrote: I was thinking about a generic pic, easy to use but flexible and came up with the following concept api. By passing the callsite and the testValue around with allowing an optional pic update I can envision this as a nice building block for several caching strategies. Of course this is all based on my use model which does not use primitives. In summary it reduces the calling of the test method to only once and it supports the quick testing of a few targets. Its also easy to understand how to use without using Charlies or Remis builders. to generate a PIC MethodHandle createPic(Callsite site, MethodHandle getTestValue, MethodHandle fallback) The getTestValue methodHandle returns the object used in the comparison process Object getTestValue(Callsite site, Object... stackArgs) The fallback returns a testValue and MH to use Object[] fallback(Callsite site, Object testValue, Object... stackArgs) where the return value is [Object newTestValue][MethodHandle newMH] // use case for Tuple if(newTestValue != null) lookup.insert(newTest, newMH) newMh.invoke() This path should also be considered fast as the first step of a fallback could be to check a cache of MH. the lookup process could be a simple chain of compare/branch operations. And the mh replacement could be a simple circular update. This chain could be limited to less than 5. regards mark ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
From Julian How is it different from Rémi's construct? Performance would be the hope. My position has been that with a decent pic api the jvm would be able to optimize the pic to a few test/branch instructions for the large majority of callsites. For your use case do you have the same situation as I do with 99%+ of call sites having less than 5 targets? regards mark___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: A simple PIC api
For your use case do you have the same situation as I do with 99%+ of call sites having less than 5 targets? Yes. We set the PIC-megamorphic threshold to 5 in Golo, in which case we degrade to a stable but slower map-based + invoker dispatch. That being said performance of the PIC construct is very good in our case. - Julien ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev