I've been doing some view rendering profiling tonight. Here are my findings. Note these are Java 5 OS X numbers, because it's what I could get a profiler to attach to. Java 6 will be faster, and I'll show it at the end.

Starting best time: 0.830000   0.000000   0.830000 (  0.829000)
Ruby 1.8.6 time:    0.510000   0.020000   0.530000 (  0.537098)
Ruby 1.8.6 time:    0.500000   0.010000   0.510000 (  0.508733)

The top entry when profiling is set up to only profile JITed Ruby code is compile_template? in Rails. This function basically tries to determine whether it should reload/recompiler a target view template by doing...you guessed it (Tom), a stat call (specifically lstat). By adding config.action_view.cache_template_loading = true that drops off the profile. Note that turning off native code (i.e. JNA, which we use to support stat) did not appear to have much affect on performance.

Tom gets a "told you so" here, but I always agreed there's probably a stat happening in the view pipeline.

New best time: 0.714000   0.000000   0.714000 (  0.714000)

The next item to show up in profiling was rails_asset_id, which calculates the asset_id tag for a given resource. Again, it's native calls involved here:

def rails_asset_id(source)
  ENV["RAILS_ASSET_ID"] ||
    File.mtime("#{RAILS_ROOT}/public/#{source}").to_i.to_s rescue ""
end

I'm not sure why they recalculate this every time, so to get around it we'll specify a RAILS_ASSET_ID in our environment.

New best time: 0.631000   0.000000   0.631000 (  0.630000)

The next highest entry in profiling is an unusual one: mon_exit, which is defined in monitor.rb, which is used exclusively for one purpose: synchronized logging. WTF! Logging uses monitor synchronization?!? Ok, easy solution, turn off logging, and prepare to bitchslap whoever thought that was a good idea (ok, so there are good reasons for it, but in Rails we've presumably only got one thread logging anyway). In our case, since we need logging to report numbers, we'll leave it on but disable the use of monitor. Presumably, since logger uses monitor, we'll want to wire up our own JRuby-smart version. Have a look at monitor.rb and fear.

New best time: unchanged, but more consistently close to 0.63

At this point, the largest items in the profile appear to be actual template rendering code. We'll pause here for now, since we can benchmark rendering time separately and we've made progress already on view rendering.

So then I promised Java 6 numbers. Here's numbers with Java 6 and the changes above, and Ruby 1.8.6 with the RAILS_ASSET_ID env var and template caching (I didn't monkey with the monitor stuff because monitor has nearly zero overhead for MRI):

Java 6 time "soylatte"): 0.484000   0.000000   0.484000 (  0.484000)
Ruby 1.8.6 time:         0.490000   0.000000   0.490000 (  0.499294)

Yay! We win!

So these really simple tweaks bring us to parity on view rendering. From here we can only improve, by making sure appropriate code is jitting well, adding optimizations for bottlenecks we find in the rendering pipeline, and so on. No worries.

- Charlie

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

   http://xircles.codehaus.org/manage_email


Reply via email to