[jvm-l] Re: Dynlang caching strategies

Jochen Theodorou Tue, 09 Sep 2008 15:29:05 -0700

Charles Oliver Nutter schrieb:
[...]
> JRuby has had an inline cache for over a year, and *actively* flushes 
> call sites.

In Groovy we had a per-class cache for method lookups. But the cache was 
too slow (generating the key cost too much), and if we have inline 
caches we don't need that kind of cache. Now I don't know what kind of 
relation your cache uses. I guess you use a (name, argument count) type 
of key and then react based on the receiver in the general case and 
maybe on a precalculated number in special cases. If that is right, then 
of course you have to check all caches if a class changes... combined 
with synchronization this looks like something that could be better if 
someone had a bright idea.
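
A minimal sketch of the "flush every site when a class changes" scheme 
guessed at above. All names (SketchClass, SketchSite, SketchRegistry) are 
hypothetical and this is not JRuby's or Groovy's actual code. The site is 
keyed by method name only, and synchronization is deliberately left out, 
so the gap between lookup and registration is where a stale entry could 
slip in under concurrent load.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// All names here are hypothetical; this is not JRuby's or Groovy's real code.
final class SketchClass {
    final Map<String, Supplier<Object>> methods = new HashMap<>();
}

final class SketchSite {
    private final String methodName;
    private final SketchRegistry registry;
    private SketchClass cachedClass;       // guard value
    private Supplier<Object> cachedMethod; // cached lookup result
    int slowLookups;                       // counts cache misses, for the demo

    SketchSite(String methodName, SketchRegistry registry) {
        this.methodName = methodName;
        this.registry = registry;
    }

    Object call(SketchClass receiverClass) {
        // Fast path: the only guard is "same receiver class as last time".
        if (cachedClass != receiverClass) {
            Supplier<Object> found = receiverClass.methods.get(methodName);
            if (found == null) {
                throw new IllegalStateException("no method " + methodName);
            }
            slowLookups++;
            cachedClass = receiverClass;
            cachedMethod = found;
            registry.register(receiverClass, this);
        }
        return cachedMethod.get();
    }

    void flush() {
        cachedClass = null;
        cachedMethod = null;
    }
}

final class SketchRegistry {
    // Runtime-global map: class -> sites currently caching one of its methods.
    private final Map<SketchClass, List<SketchSite>> sites = new HashMap<>();

    void register(SketchClass owner, SketchSite site) {
        sites.computeIfAbsent(owner, k -> new ArrayList<>()).add(site);
    }

    // Every mutation goes through here and actively flushes the known sites.
    void define(SketchClass owner, String name, Supplier<Object> body) {
        owner.methods.put(name, body);
        List<SketchSite> stale = sites.remove(owner);
        if (stale != null) {
            for (SketchSite site : stale) {
                site.flush();
            }
        }
    }
}
```

The cost of a change is proportional to the number of registered sites, 
while the per-call guard stays a single reference comparison.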

> We keep references to them in a JRuby-runtime-global map, 
> and when changes happen to the hierarchies from which they came we tell 
> all sites known to have cached that method to flush themselves. This 
> works very well so far, but there's a very small chance that under high 
> concurrent load, if a method is simultaneously being called by one 
> thread and redefined by another a cache could get stuck with a stale 
> entry and potentially never flushed. I have not been able to produce 
> this effect experimentally and it has never been reported, so we have 
> not made substantial efforts to replace the active flushing logic.

hehe... we once had a bug report that was caused by a method object 
(a custom object wrapping a reflection method object) existing but not 
being initialized yet... that was a pretty tough one to debug, because of 
course we were not able to reproduce the problem: it depends on too many 
factors that are not the same on the developer machine. But maybe class 
changes do not happen rapidly enough for that to show up.

> Are you using a version number now? We have attempted to install a 
> version number but it's complicated by several factors:

we use it in some cases, yes, but maybe Groovy has properties that allow 
this and Ruby does not.

> Given class X from which we cached the method:
> - A change anywhere in X or its superclasses must trigger X's version to 
> increment.
> - A change anywhere in any modules included into X or any of its 
> superclasses must trigger X's version to increment.
> 
> These two combined mean we have to have both downreferences from parent 
> to child (necessarily weak references) as well as backreferences from 
> included modules to the pseudoclasses used to mix them in.

I see the difference... in Groovy a superclass change may not affect 
its children. We thought about changing that, but it is not done yet... 
at least not really. Anyway... I guess you need references from the 
children to the parents and you need to store the method and the 
receiver class... which goes well for a call site cache, but maybe not 
for your inline cache... but I guess that is nothing that can't be 
overcome. Anyway, instead of using the method's class, the receiver 
pseudo class needs to be used. And to check the version number the 
pseudo class also needs to know the version number of its parent, and if 
a version check is done, then it needs to check the parent too... of 
course that means that a lot of version checks are done on each call... 
which is probably bad and a bit complicated.
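
A rough sketch of that parent-walking version check, with hypothetical 
names (VersionedClass, VersionGuard) and a fixed single-parent chain 
assumed; included modules are not modelled. Because each counter only 
ever grows, the sum along the chain changes whenever any ancestor 
changes.

```java
// Hypothetical sketch; assumes a fixed single-inheritance chain.
final class VersionedClass {
    private final VersionedClass parent; // null for the root
    private int version;                 // bumped on every mutation of this class

    VersionedClass(VersionedClass parent) {
        this.parent = parent;
    }

    void mutate() {
        version++;
    }

    // Walks the whole chain on every call: cost grows with hierarchy depth.
    long compositeVersion() {
        long sum = 0;
        for (VersionedClass c = this; c != null; c = c.parent) {
            sum += c.version;
        }
        return sum;
    }
}

final class VersionGuard {
    private final VersionedClass receiverClass;
    private final long seenVersion;

    VersionGuard(VersionedClass receiverClass) {
        this.receiverClass = receiverClass;
        this.seenVersion = receiverClass.compositeVersion();
    }

    // The cached method may be used only if class and composite version match.
    boolean stillValid(VersionedClass incoming) {
        return incoming == receiverClass
                && incoming.compositeVersion() == seenVersion;
    }
}
```

Nothing needs to be flushed on a change, but every guarded call pays for 
the walk up the hierarchy, which is the cost worried about above.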

> An alternative, which I implemented as an experiment, is to instead 
> calculate the version each time, compositing version information 
> recursively through the hierarchy. If the per-class cost is low enough, 
> this does improve performance slightly over a raw search for the method 
> in all superclasses' method maps. But it doesn't appear to help enough 
> to warrant the complexity.

you mean the way I just suggested here? Hmm...

>> For example in a call site cache, you can throw the method out, as soon 
>> as you revisit the cache. If the method is cached in a memory sensitive 
>> way, then there is no need to do that more early I think. If your cache 
>> happens to be in the class itself, then invalidating the cache as soon 
>> as the class is mutated might be an idea. If you have some kind of 
>> inline cache, then you usually have something based on method names and 
>> the cache has some kind of structure that reacts to what class is in 
>> use. Well... I would say that once you found your case you have to check 
>> the version again.
> 
> The benefit of throwing it out early is reducing the guard cost at the 
> site itself. If we can actively clean out sites there's no need to 
> re-check anything other than incoming object type == previously cached 
> object type.

true

> We have been looking at various strategies, including per-class caches, 
> global caches, class serial/version numbers, and so far have been 
> thwarted trying to implement them due to the extreme mutability of 
> Ruby's class hierarchy.

hmm... I just had a new idea, but it involves creating many objects and 
I am not sure it helps much here... anyway... you wrote:

>> Given class X from which we cached the method:
>> - A change anywhere in X or its superclasses must trigger X's version to 
>> increment.
>> - A change anywhere in any modules included into X or any of its 
>> superclasses must trigger X's version to increment.

ok let me try to develop a new idea here... let us say that each time 
you mutate a class in any way you create a new class and the old class 
is flagged as dirty. Let us also say that there is some kind of 
ClassHandle that stores a reference to the current class. Now a 
superclass may have a listener-like structure for each child class. If 
the superclass changes, it notifies each child by marking it as dirty 
and then marks itself as dirty too... I guess this needs to be 
synchronized. Anyway, the handle is constant and now points to a dirty 
class. If you do any action involving this class, then you first need to 
create the class anew to get a version that is not dirty. This new 
version will then again register itself at the parent and update the 
method handle.

In the inline cache you store the tuple of receiver class (the target of 
the method handle), the method handle, and the handle. Then you would 
have to check if handle.reference==receiverClass && 
!handle.reference.isDirty(). If the update already happened, then that 
part of the cache is no longer called. I don't know if you have timeouts 
for your caches or if you keep track of which branches have been called 
to ensure the cache does not blow up... or you change the order to 
!handle.reference.isDirty() && handle.reference==receiverClass and in 
case of isDirty you change the cache. Method objects do not use the real 
class, instead they use the handle, meaning that you don't need to 
recreate them when a class changes.

Well, I guess that is not really different from the version idea, since 
I now use a reference instead of a number, but it would be interesting 
to know how this affects performance. The advantage is that you never 
need to go through all caches and update them. You also do not need to 
update classes that are not used. The disadvantage is that you still 
have a lot of synchronization.
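
One possible reading of the handle idea, as a single-threaded sketch. 
The names Snapshot, Handle and HandleGuard are invented for illustration, 
the synchronization mentioned above is omitted, and method objects are 
not modelled at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one immutable-ish "class version" per Snapshot.
final class Snapshot {
    private final List<Snapshot> children = new ArrayList<>(); // listener-like list
    private volatile boolean dirty;

    boolean isDirty() {
        return dirty;
    }

    void addChild(Snapshot child) {
        children.add(child);
    }

    // Notify each child first, then flag this snapshot itself.
    void markDirty() {
        for (Snapshot child : children) {
            child.markDirty();
        }
        dirty = true;
    }
}

// The constant handle; only the snapshot it points to is replaced.
final class Handle {
    private final Handle parent;
    volatile Snapshot reference;

    Handle(Handle parent) {
        this.parent = parent;
        refresh();
    }

    // A mutation only flags the current snapshot and its children.
    void mutate() {
        reference.markDirty();
    }

    // Recreate lazily, only when the class is actually used again.
    Snapshot refresh() {
        Snapshot s = reference;
        if (s == null || s.isDirty()) {
            s = new Snapshot();
            if (parent != null) {
                parent.refresh().addChild(s); // re-register at the parent
            }
            reference = s;
        }
        return s;
    }
}

// What an inline cache entry would keep next to the cached method.
final class HandleGuard {
    private final Handle handle;
    private final Snapshot receiverClass;

    HandleGuard(Handle handle) {
        this.handle = handle;
        this.receiverClass = handle.refresh();
    }

    boolean stillValid() {
        Snapshot now = handle.reference;
        return now == receiverClass && !now.isDirty();
    }
}
```

Classes that are never used again stay dirty and are never rebuilt, and 
no cache is visited when a class changes; the guard costs one reference 
comparison plus one volatile read of the dirty flag.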

>> What is bad about the version idea? In a multi threaded environment you 
>> have the problem, that each time you ask for the class version you have 
>> to access synchronized data... There are possible ways around it if you 
>> have thread locals that get informed about class changes in a event like 
>> system. A bit lame probably ;)
> 
> I think a volatile version number would be sufficient, and only updating 
> the version number would need to be synchronized.

yes, I counted that as synchronization too.

> There's no reason 
> accessing the version number would have to be synchronized since field 
> access is an atomic operation.

field access yes, object creation no. So in the case of a primitive 
number there is of course no problem ;) Still, it is not a CPU-cache-
friendly solution, since it requires memory synchronization for each 
method call.
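
A small sketch of the volatile version number under discussion, with 
invented names (SketchVersion, bumpFromTwoThreads). Readers use a plain 
volatile read and only writers take a lock. An int is used here because 
the Java language specification does not guarantee atomic access for 
non-volatile long and double fields.

```java
// Hypothetical sketch; the class and method names are invented here.
final class SketchVersion {
    // volatile: a read is guaranteed to observe the most recent write
    private volatile int version;

    // Called on every guarded method call; no lock is taken.
    int current() {
        return version;
    }

    // version++ is a read-modify-write, so concurrent writers must serialize.
    synchronized void bump() {
        version++;
    }

    // Two threads bump the same counter; returns the final version.
    static int bumpFromTwoThreads(int perThread) {
        SketchVersion v = new SketchVersion();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                v.bump();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return v.current();
    }
}
```

No update is lost because bump() is synchronized, but each call to 
current() is still a volatile read, which is the per-call memory 
synchronization cost mentioned above.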

bye blackdrag

-- 
Jochen "blackdrag" Theodorou
The Groovy Project Tech Lead (http://groovy.codehaus.org)
http://blackdragsview.blogspot.com/
http://www.g2one.com/

