On Sun, Jul 26, 2009 at 3:26 AM, Subramanya Sastry <sss.li...@gmail.com> wrote:
> I have been trying to lay down clearly all the dynamic features of Ruby
> so that I understand this better. Please add/correct if I have understood
> this incorrectly.
>
> 1. Open classes: This is the well known case where you can modify
> classes, add methods, and redefine methods at runtime. This effectively
> means that method calls cannot be resolved at compile time. The
> optimization for improved performance is to optimistically assume closed
> classes, but then have solid mechanisms to back out in some way (either
> compile-time guards or run-time redefinition detection and cache
> invalidation).

Hotspot will synthesize types based on runtime profiling for things like
interfaces so that it can internally perform static dispatch (or so I have
been told). It also has guards to deopt if those type assumptions turn out
to be wrong. We could do something similar, since, as others have noted, at
some point Ruby classes reach an unchanging state in 99% of all
applications (yes, I made that number up, but I will stand by it :) ).
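To make that concrete, here is a toy Ruby sketch of the guard idea: an
optimistic per-call-site cache invalidated by a per-class serial number
that gets bumped whenever a class gains or redefines a method. The
CallSite class and SERIALS table are names I made up for illustration;
this is not how Hotspot (or JRuby) actually implements it.

-----
# Per-class version counters; any method (re)definition bumps the
# owner's serial, invalidating caches keyed on the old value.
SERIALS = Hash.new(0)

class Module
  def method_added(name)
    SERIALS[self] += 1
  end
end

# A guarded call site: optimistically cache the looked-up method and
# reuse it while the receiver's class and serial still match (the fast
# path); on a mismatch, fall back to a full lookup (the "deopt").
class CallSite
  def initialize(name)
    @name = name
    @cached_class = @cached_serial = @cached_method = nil
  end

  def call(receiver, *args)
    klass = receiver.class
    if klass.equal?(@cached_class) && SERIALS[klass] == @cached_serial
      @cached_method.bind(receiver).call(*args)   # fast path
    else
      @cached_class  = klass                      # slow path: re-resolve
      @cached_serial = SERIALS[klass]
      @cached_method = klass.instance_method(@name)
      @cached_method.bind(receiver).call(*args)
    end
  end
end

class Foo
  def greet; "hi"; end
end

site = CallSite.new(:greet)
site.call(Foo.new)   # slow path: resolves greet and caches it
site.call(Foo.new)   # fast path: guard passes

class Foo            # reopening Foo bumps its serial...
  def greet; "hello"; end
end
site.call(Foo.new)   # ...so the guard fails and we re-resolve => "hello"
-----

In compiled code the serial check would be inlined, so the fast path is a
compare and a branch; the interesting engineering is making that check
nearly free.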
> 2. Duck typing: This is also the well known case where you need not have
> fixed types for method arguments, as long as the argument objects can
> respond to a message (method call) and meet the message contract at the
> time of the invocation (this could include meeting the contract via
> dynamic code generation in method_missing). This means that you cannot
> statically bind method names to static methods. The optimization for
> improved performance is to rely on profiling and inline caching.

Same comment as above: profiling plus inline caching with guards covers
this one too.

> 3. Closures: This is where you can create code blocks, store them, pass
> them around, and invoke them. Supporting this requires allocating heap
> frames that capture the environment and keep it around for later. The
> optimizations for improved performance include (a) lazy frame allocation
> (on demand, on call paths where closures are encountered), (b) only
> allocating frame space for variables that might be accessed later (in
> some cases, this means all variables), (c) inlining the target method and
> the closure and eliminating the closure altogether [using a technique
> from one of my earlier IR emails], and (d) special-case optimizations
> like the ones Charlie and Yehuda have identified.

(c) is probably the grail for good closure performance. The question of
when you can avoid keeping variables around is probably the most
interesting discussion here; it could be a thread by itself. I think
collectively we can identify many special cases where we don't need to
capture all variables. Of course, that assumes we can track changes to the
target class's methods; if one changes for some reason, we need to be able
to deopt.

> 4. Dynamic dispatch: This is where you use "send" to send method
> messages. You can get improved performance by profiling and inline
> caching techniques.

If we can easily determine that send is really Kernel#send, then we can
just replace the send with a regular dispatch. My bigger question is: can
we figure out whether these things (send, eval, binding) are actually the
nasty methods, rather than just assuming that anything with these names is
nasty? I know the answer is yes, but I think internally we should have
some systematic way of tracking dangerous entities, so we have one
framework to use for various optimizations.

> 5. Dynamic code gen: This is the various forms of eval. This means that
> eval calls are hard boundaries for optimization, since they can modify
> the execution context of the currently executing code. There is no clear
> way I can think of at this time to get around the performance penalties
> associated with it. But I can imagine special-case optimizations,
> including analyzing the target string where it is known and where the
> binding context is local.

Optimizing eval is probably not an immediate concern. Most places that
eval do it to create a new method, or at least to generate something,
once. Tight loops of evals do not seem all that common to me, and the cost
of parsing the eval'd code at runtime by itself makes optimizing the
execution side of eval not so important. I guess I would just do no
optimizations in this case (though I am sure there are smaller cases where
we can do something).
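For instance, a lot of real-world eval usage looks roughly like this
sketch: a one-time class_eval at load time to generate methods, not eval
inside hot code (Record and its fields are made-up names):

-----
# Typical real-world eval: generate reader/writer methods once, while
# the class is being defined, and never eval again afterwards. The
# generated methods are plain methods the compiler can treat like any
# other.
class Record
  %w[name email].each do |field|
    class_eval <<-RUBY
      def #{field}
        @#{field}
      end

      def #{field}=(value)
        @#{field} = value
      end
    RUBY
  end
end

r = Record.new
r.name = "Subbu"
puts r.name   # => Subbu
-----

The parse/eval cost here is paid once per class at load time, which is why
the execution side of eval rarely matters.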
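And circling back to send for a second, here is a quick illustration of
why matching on the name alone neither catches all the nasty calls nor
only the nasty calls (Sneaky and Plain are made-up names):

-----
# A method named "send" that is not Kernel#send at all; treating every
# call spelled "send" as dangerous would pessimize harmless code like
# this.
class Sneaky
  def send(message, *args)
    "pretend mail: #{message}"
  end
end

puts Sneaky.new.send(:hello)   # => pretend mail: hello

# Kernel#send reached under a different name; a compiler looking only
# for the literal name "send" misses this genuinely dynamic dispatch.
class Plain
  alias_method :despatch, :send
end

puts Plain.new.despatch(:object_id)   # real dynamic dispatch in disguise
-----

So whatever framework we build needs to track the dangerous method objects
themselves, not just call sites that happen to use the names.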
> 6. Dynamic/Late binding: This is where the execution context comes from
> an explicit binding argument (proc, binding, closure). This is something
> I was not aware of till recently.

Yucky stuff. There are times when we can probably optimize based on
knowing how the binding/proc/closure is used. As Charlie notes,
AOT-defined Ruby methods that we fully understand, like the core method
impls, can be optimized. I think we can even make arbitrary Ruby methods
optimizable by recording at compile time whether they do wacky stuff.
There is some chicken-and-egg here in my mind: we need to know that a
block will be passed to a method, and that the method does not use the
block in a strange way, before creating the block itself.

>
> Many performance problems and optimization barriers come about because
> of a combination of these techniques.
>
> Consider this code snippet:
>
> -----
> def foo(m, expr)
>   a = 1
>   b = 2
>   send(m, expr)
>   puts "b is #{b}"
> end
>
> foo("puts", "b=a+3")   # outputs b=a+3\n b is 2
> foo("eval", "b=a+3")   # outputs b is 4
> -----
>
> This code snippet combines dynamic dispatch and dynamic code gen (send +
> eval). The net effect is that all sends where the target cannot be
> determined at compile time become hard optimization barriers, just like
> eval. Before the send you have to dump all live variables to the
> stack/heap, and after the send you have to restore them from the
> stack/heap. In addition, you also have to restore any additional
> variables that the eval might have created on the stack/heap.
>
> One way around this is to use different code paths based on checking
> whether the send target is eval or not.
>
> Now, consider this code snippet:
> ------
> def foo(n, x)
>   proc do
>     n + 1
>   end
> end
>
> def bar(i)
>   proc do
>     t = foo(i, "hello")
>     send("eval", "puts x, n", t)   # Ruby 1.8 accepts a Proc as the binding
>   end
> end
>
> delayed_eval_procs = (1..10).collect { |i| bar(i) }
> ... go round the world, do things, and come back ...
> delayed_eval_procs.each { |p| p.call }
> ------
>
> This is a contrived example, but basically it means you have to keep
> frames around for a long time, until they are GCed. In this case
> delayed_eval_procs keeps a live reference to the 20 frames created by
> foo and bar.
>
> While the examples here are contrived, there is no way to "ban" them
> from Ruby, so the compilation strategies have to be robust enough to be
> correct.
>
> I haven't investigated aliasing yet... but I suspect it introduces
> further challenges.
>
> Subbu.

--
Blog: http://www.bloglines.com/blog/ThomasEEnebo
Email: en...@acm.org, tom.en...@gmail.com