For some time we have been searching for a Rails deployment architecture that works for us. We've recently moved from Apache 1.3 + FastCGI to Apache 2.2 + mod_proxy_balancer + mongrel_cluster, and it's a significant improvement. But it still exhibits serious performance problems.
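For context, the arrangement in question is the usual proxy-balancer setup: Apache forwards each request to one of the mongrels in the cluster. A hypothetical vhost fragment (the balancer name and backend ports here are illustrative assumptions, not our actual configuration) looks something like:

```apache
# Hypothetical fragment: two mongrels behind mod_proxy_balancer.
# Balancer name and backend ports are illustrative assumptions.
<Proxy balancer://mongrels>
  BalancerMember http://127.0.0.1:8000
  BalancerMember http://127.0.0.1:8001
</Proxy>
ProxyPass / balancer://mongrels/
ProxyPassReverse / balancer://mongrels/
```

The important point for what follows is that nothing in this configuration stops Apache from sending a request to a mongrel that is already busy serving another one.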
We have the beginnings of a fix that we would like to share.

To illustrate the problem, imagine a two-element mongrel cluster running a Rails app containing the following simple controller:

  class HomeController < ApplicationController
    def fast
      sleep 1
      render :text => "I'm fast"
    end

    def slow
      sleep 10
      render :text => "I'm slow"
    end
  end

and the following test script:

  #!/usr/bin/env ruby

  require File.dirname(__FILE__) + '/config/boot'
  require File.dirname(__FILE__) + '/config/environment'
  require 'net/http'

  end_time = 1.minute.from_now
  fast_count = 0
  slow_count = 0

  fastthread = Thread.start do
    while Time.now < end_time do
      Net::HTTP.get 'localhost', '/home/fast'
      fast_count += 1
    end
  end

  slowthread = Thread.start do
    while Time.now < end_time do
      Net::HTTP.get 'localhost', '/home/slow'
      slow_count += 1
    end
  end

  fastthread.join
  slowthread.join

  puts "Fast: #{fast_count}"
  puts "Slow: #{slow_count}"

In this scenario there are two requests outstanding at any time, one "fast" and one "slow". You would expect approximately 60 fast and 6 slow GETs to complete over the course of a minute. That is not what happens: approximately 12 fast and 6 slow GETs complete per minute. The reason is that mod_proxy_balancer assumes it can send multiple concurrent requests to each mongrel, and because Rails handles only one request at a time, fast requests end up queued behind slow requests even when an idle mongrel is available.

We've experimented with various configurations for mod_proxy_balancer without solving this issue. As far as we can tell, all the other popular load balancers (Pound, Pen, balance) behave in roughly the same way.

This is causing us real problems. Our user interface is very time-sensitive: for common user actions, a page refresh delay of more than a couple of seconds is unacceptable.
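The arithmetic behind those numbers can be checked with a toy discrete-event model (an illustrative sketch we wrote for this post, not the actual test above and not real network code): two single-request backends, a client looping on 1-second requests, and a client looping on 10-second requests. A dispatcher that hands out requests round-robin regardless of backend busyness produces roughly the dozen fast requests per minute we observed, while one that always picks the least-busy backend gets the ideal 60.

```ruby
# Toy model: two backends, each serving one request at a time; a "fast"
# client issuing 1-second requests and a "slow" client issuing 10-second
# requests, both looping until the 60-second mark.
FAST_COST = 1.0
SLOW_COST = 10.0

def simulate(duration, idle_first:)
  free_at   = [0.0, 0.0]               # when each backend next becomes idle
  next_send = { fast: 0.0, slow: 0.0 } # when each client issues its next GET
  counts    = { fast: 0, slow: 0 }
  cursor    = 0                        # round-robin position

  loop do
    # Serve whichever client is due to send next (fast wins ties).
    kind = next_send[:fast] <= next_send[:slow] ? :fast : :slow
    t = next_send[kind]
    break if t >= duration             # clients stop sending at end_time

    backend = if idle_first
                free_at.index(free_at.min) # always pick the least-busy backend
              else
                (cursor += 1) % 2          # blind round-robin, busy or not
              end

    start = [t, free_at[backend]].max  # may queue behind a running request
    done  = start + (kind == :fast ? FAST_COST : SLOW_COST)
    free_at[backend] = done
    next_send[kind]  = done            # client loops as soon as the GET returns
    counts[kind] += 1
  end
  counts
end

puts "round-robin: #{simulate(60, idle_first: false)}" # fast stuck behind slow
puts "least-busy:  #{simulate(60, idle_first: true)}"  # fast always finds an idle backend
```

In the round-robin case the model shows fast requests completing in bursts of two per 10-second cycle, exactly the head-of-line blocking described above; the least-busy policy keeps one backend dedicated to each client and both run at full speed.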
What we're finding is that if we have (say) a reporting page which takes 10 seconds to display (an entirely acceptable delay for a rarely-used report), then our users see similar delays on pages which should be virtually instantaneous (and would be, if their requests were directed to idle servers). Worse, we occasionally see unnecessary timeouts because requests queue up on one server.

The real solution would be to fix Rails' inability to handle more than one request at a time. In the absence of that solution, however, we've implemented (in Ruby) what might be the world's smallest load balancer. It only ever sends a single request to each member of the cluster at a time. It's called HighWire and is available on RubyForge (no gem yet - it's on the list of things to do!):

  svn checkout svn://rubyforge.org/var/svn/highwire

Using this instead of mod_proxy_balancer, and running the same test script as above, we see approximately 54 fast and 6 slow requests per minute.

HighWire is very young and has a way to go. It hasn't had any serious optimization or testing, and there are a number of things that need doing before it can really be considered production-ready. But it does work for us, and it does produce a significant performance improvement. Please check it out and let us know what you think.

--
paul.butcher->msgCount++

Snetterton, Castle Combe, Cadwell Park...
Who says I have a one track mind?

MSN: [EMAIL PROTECTED]
AIM: paulrabutcher
Skype: paulrabutcher
LinkedIn: https://www.linkedin.com/in/paulbutcher

_______________________________________________
Mongrel-users mailing list
Mongrel-users@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-users