On Sat, Jul 25, 2009 at 7:26 PM, Subramanya Sastry <sss.li...@gmail.com> wrote:

> I have been trying to lay down clearly all the dynamic features of Ruby
> so that I understand this better. Please add/correct if I have understood
> this incorrectly.
>
> 1. Open classes: This is the well known case where you can modify
> classes, add methods, redefine methods at runtime. This effectively means
> that method calls cannot be resolved at compile time. The optimization
> for improved performance is to optimistically assume closed classes but
> then have solid mechanisms to back out in some way (either compile-time
> guards or run-time invalidation detection & invalidation).

Add the ability to include modules at runtime, which has peril but
promise. Modules get inserted into the hierarchy of a class, which means
that they effectively become a new class. However, you can also add
modules directly onto any object at runtime (just as you can add methods
directly onto a single object). This means that simple class caching
can't work, since an object can have different methods than its class.
However, in the case of modules, it is hypothetically possible to create
shadow classes that represent a class + specific collections of modules.
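To make the peril concrete, here is a minimal sketch (module and class
names are hypothetical) of both forms: inserting a module into a class's
hierarchy at runtime, and extending a single object so that it has
methods its class does not:

  module Greeting
    def greet
      "hello"
    end
  end

  class Widget; end

  # Runtime inclusion: Greeting is inserted into Widget's ancestor
  # chain, so every Widget instance now responds to #greet.
  Widget.send(:include, Greeting)
  Widget.new.greet                # => "hello"

  # Per-object extension: only this one object gains #greet, so a
  # method cache keyed on the object's class alone is insufficient.
  lone = Object.new
  lone.extend(Greeting)
  lone.greet                      # => "hello"
  Object.new.respond_to?(:greet)  # => false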
> 2. Duck typing: This is also the well known case where you need not have
> fixed types for method arguments as long as the argument objects can
> respond to a message (method call) and meet the message contract at the
> time of the invocation (this could include meeting the contract via
> dynamic code generation using method_missing). This means that you
> cannot statically bind method names to static methods. The optimization
> for improved performance is to rely on profiling and inline caching.

AKA polymorphic dispatch. In Ruby, it is hypothetically possible to
determine certain details at compile time (for instance, methods called
on object literals). In general, though, the idea of determining before
runtime what method will be called is a fool's errand--there are simply
too many commonly used features that can change these semantics.

However--as I have pointed out to Charlie a number of times--in practice,
classes are basically frozen after *some* time. In Rails, pretty much all
classes reach their final stage at the end of the bootup phase. However,
since JRuby sees only a parse phase and then a generic "runtime", it's
not possible for it to determine when that has happened. I personally
would be willing to give a guarantee to Ruby that all classes are in a
final state. This is actually possible in Ruby right now via:

  ObjectSpace.each_object(Class)  {|klass| klass.freeze}
  ObjectSpace.each_object(Module) {|mod| mod.freeze}

It should be possible to completely eliminate the method cache check in
JRuby for frozen classes (if all of their superclasses are also frozen)
and treat all method calls as entirely static. An interesting side note
is that most methods are JITed only *after* the boot phase is done, and
it should also be possible to have a mode that JITs only frozen classes
(to apply some more aggressive optimizations).
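To make the proposed guarantee concrete: Ruby already refuses to modify a
frozen class, so a frozen class's method table is stable. A minimal
sketch (class name hypothetical; the exact exception class differs across
Ruby versions):

  class Point
    def x; 1; end
  end

  Point.freeze

  begin
    class Point        # reopening a frozen class...
      def y; 2; end    # ...and defining a method raises at runtime
    end
  rescue TypeError, RuntimeError => e
    puts e.message     # e.g. "can't modify frozen class"
  end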
> 3. Closures: This is where you can create code blocks, store them, pass
> them around, and invoke them. Supporting this requires allocating heap
> frames that capture the environment and keep it around for later. The
> optimization for improved performance includes (a) lazy frame allocation
> (on-demand on call paths where closures are encountered), (b) only
> allocating frame space for variables that might be accessed later (in
> some cases, this means all variables), (c) inlining the target method
> and the closure and eliminating the closure altogether [using a
> technique in one of my early IR emails], and (d) special case
> optimizations like the cases Charlie and Yehuda have identified.

There are some additional closure perils. For one, once you have captured
a block, you can eval a String into the block, which gives you access to
the entire closure scope, including variables that are not used in the
closure. As Charlie pointed out earlier, however, this can only happen if
you actually capture the block in Ruby code; otherwise, this behavior is
not possible.

You can also do things like:

  def my_method
    [1,2,3].each { yield }
  end

which yields the block passed into my_method, and:

  def my_method
    [1,2,3].each {|x| return if x == 2 }
  end

which returns from my_method. You can also alter the "self" of a block
while maintaining its closure, which should not have any major
performance implications.

> 4. Dynamic dispatch: This is where you use "send" to send method
> messages. You can get improved performance by profiling and inline
> caching techniques.

The most common use of send is send(:literal_symbol), which is used to
get around visibility restrictions. If it were possible to determine that
send was actually send (and not, for instance, redefined on the object),
you could treat send with a literal Symbol or String as a literal method
invocation without visibility checks. It would be possible to apply this
optimization to frozen classes, for instance. I also discussed doing a
full bytecode flush whenever people do stupid and very unusual things
(like aliasing a method that generates backrefs, or overriding eval or
send).

> 5. Dynamic code gen: This is the various forms of eval. This means that
> eval calls are hard boundaries for optimization since they can modify
> the execution context of the currently executing code. There is no clear
> way I can think of at this time of getting around the performance
> penalties associated with it. But, I can imagine special case
> optimizations including analyzing the target string, where it is known,
> and where the binding context is local.

This is extremely common, but mainly using the class_eval and
instance_eval forms. These forms are EXACTLY equivalent to simply parsing
and executing the code in the class or instance context. For instance:

  class Yehuda
  end

  Yehuda.class_eval <<-RUBY
    def omg
      "OMG"
    end
  RUBY

is exactly equivalent to:

  class Yehuda
    def omg
      "OMG"
    end
  end

As a result, I don't see why there are any special performance
implications associated with it. There is the one-time cost of
calculating the String, but then it should be identical to evaluating the
code when requiring a file.

> 6. Dynamic/Late binding: This is where the execution context comes from
> an explicit binding argument (proc, binding, closure). This is something
> I was not aware of till recently.

This is only present when using eval, and it would be absolutely
acceptable to make this path significantly slower if it meant any
noticeable improvement in the rest of the system.
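For reference, a minimal sketch of the explicit-binding form being
described (method and variable names are hypothetical): eval can take a
Binding (or, in 1.8, a Proc) and execute in that captured scope:

  def capture
    secret = 42
    binding   # package the local scope up as a first-class object
  end

  b = capture
  # The eval executes in capture's scope, not the caller's, so it
  # can read, and even create, locals there.
  puts eval("secret", b)          # => 42
  eval("leaked = secret * 2", b)
  puts eval("leaked", b)          # => 84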
> Many performance problems and optimization barriers come about because
> of a combination of these techniques.
>
> Consider this code snippet:
>
> -----
> def foo(m,expr)
>   a = 1
>   b = 2
>   send(m, expr)
>   puts "b is #{b}"
> end
>
> foo("puts", "b=a+3")   # outputs b=a+3\n b is 2
> foo("eval", "b=a+3")   # outputs b is 4
> -----

The truth is that send itself is rather uncommon, and when it occurs it
is almost always with a Symbol or String literal. If you just did a pure
deopt in the case of send with a dynamic target, you'd get a lot of perf
in MOST cases, and the same exact perf in a few cases. Sounds like a win
to me.

> This code snippet combines dynamic dispatch and dynamic code gen (send +
> eval). The net effect is that all sends where the target cannot be
> determined at compile time become hard optimization barriers, just like
> eval. Before the send you have to dump all live variables to stack/heap,
> and after the send, you have to restore them back from the stack/heap.
> In addition, you also have to restore all additional variables that the
> eval might have created on the stack/heap.

Here's an example of an actual use-case in Rails:

  def helper_method(*meths)
    meths.flatten.each do |meth|
      _helpers.class_eval <<-ruby_eval, __FILE__, __LINE__ + 1
        def #{meth}(*args, &blk)
          controller.send(%(#{meth}), *args, &blk)
        end
      ruby_eval
    end
  end

This may seem insane at first glance, but there are a number of
mitigating factors that make this easy to optimize:

- The eval happens once. This method simply provides parse-time
  declarative features to Rails controllers. You can think of
  helper_method as a parse-time macro that is expanded when the class is
  evaluated.
- The send actually isn't dynamic at all. If you call
  helper_method(:foo), that send gets expanded to
  controller.send(%(foo), *args, &blk), where %(foo) is a String literal
  that can be compiled into a method call without a visibility check.

> One way around is to use different code paths based on checking whether
> the send target is eval or not.

That can't work if you have a send to an unknown target, but that case is
extremely uncommon, and again, if you can make everything else faster
unless you have a send(foo), it's well worth it.

> Now, consider this code snippet:
>
> ------
> def foo(n,x)
>   proc do
>     n+1
>   end
> end
>
> def bar(i)
>   proc do
>     t = foo(i, "hello")
>     send("eval", "puts x, n", t)
>   end
> end
>
> delayed_eval_procs = (1..10).collect { |i| bar(i) }
> ... go round the world, do things, and come back ...
> delayed_eval_procs.each { |p| p.call }
> ------
>
> This is a contrived example, but basically this means you have to keep
> around frames for long times till they are GCed. In this case
> delayed_eval_procs keeps around a live ref to the 20 frames created by
> foo and bar.

However, in the one case where you do care about the backref information
in frames (for instance), you only care about the LAST backref that is
generated, which means that you only need one slot. Are you thinking
otherwise? If so, why?

> While the examples here are contrived, since there is no way to "ban"
> them from Ruby, the compilation strategies have to be robust enough to
> be correct.

Considering that they're so rare, it's OK to do extreme deopts to take
care of them.

> I haven't investigated aliasing yet ... but I suspect it introduces
> further challenges.

I think that aliasing dangerous methods happens so rarely that flushing
all of the bytecode in that case is an acceptable deopt.

> Subbu.

-- 
Yehuda Katz
Developer | Engine Yard
(ph) 718.877.1325