On 8/7/07, Frederick Cheung <[EMAIL PROTECTED]> wrote:
> Executive Summary:
> ================
>
> I've recently come across some performance problems with eager
> loading multiple has_many relationships (and to a lesser extent a
> single has_many with many objects in the collection) and had some
> thoughts.
>
> The case I came across involved some models like so:
>
> class Question < ActiveRecord::Base
>    has_many :incoming_messages
>    has_many :outgoing_messages
> end
>
> class IncomingMessage < ActiveRecord::Base
>    belongs_to :question
> end
>
> class OutgoingMessage < ActiveRecord::Base
>    belongs_to :question
> end
>
> In various parts of the app we load a question (or multiple ones)
> with :include => [:incoming_messages, :outgoing_messages]
>
> Typically a question has  a small number of incoming and outgoing
> messages (often only 1 or 2) and this all works absolutely fine.
> However at some point we ended up with a question with many incoming
> and outgoing_messages. Our servers (quite literally) ground to a halt
> whenever loading that question with the aforementioned includes, so I
> had a look under the hood.
>
> The underlying thing is that in this case Question.find(1, :include
> => [:incoming_messages, :outgoing_messages]) returns quite a few rows
> and so even fairly small things add up very quickly
>
> I've put together some changes that improve the situation, along with
> some numbers
>
> Numbers:
> ===========
>
> In my benchmarks I've used 2 instances of Question: one with 150
> incoming and 80 outgoing (big question) and one with 225 incoming and
> 120 outgoing (huge question) (ie 50% more of each, so total row count
> goes up by 2.25)

The main issue you are running into is that the SQL Rails generates
for multiple included has_many associations returns the Cartesian
product of those associations.  Ideally, the best way to handle this
is to send two or three separate SQL statements, one for each
association, and then combine the results.  The most efficient way is
probably n+1 queries, where n is the number of has_many associations:
one query for the main object, and one query per has_many association
that selects only the association's columns and the main object's id
(needed to attach the rows to it).  That would cut the number of rows
returned for the queries you mention from 12,000 to 231 and from
27,000 to 346.  It's more complex than the current implementation,
but it will perform much better.  I'm not volunteering to implement
it, though. :)
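
To make the row counts concrete, here is a rough sketch in plain Ruby
that does not touch ActiveRecord at all.  The arrays of hashes are
hypothetical stand-ins for the three tables, and the method names
(joined_rows, separate_queries) are made up for illustration; it only
shows why one joined query returns 150 * 80 rows while the n+1
approach returns 1 + 150 + 80.

```ruby
# Hypothetical in-memory stand-ins for the tables (not ActiveRecord).
QUESTION = { id: 1, title: "big question" }
INCOMING = Array.new(150) { |i| { id: i + 1, question_id: 1 } }
OUTGOING = Array.new(80)  { |i| { id: i + 1, question_id: 1 } }

# Roughly what a single query joining both has_many tables returns:
# one row per (incoming, outgoing) pair for the question.
# Assumes both associations are non-empty.
def joined_rows(question, incoming, outgoing)
  ins  = incoming.select { |m| m[:question_id] == question[:id] }
  outs = outgoing.select { |m| m[:question_id] == question[:id] }
  ins.product(outs).map do |i, o|
    { question: question, incoming: i, outgoing: o }
  end
end

# The n+1 idea: one "query" for the main object, then one per
# association, matched back to the main object by question_id.
# Returns the loaded data and the total number of rows fetched.
def separate_queries(question, associations)
  rows_fetched = 1 # the row for the question itself
  loaded = { question: question }
  associations.each do |name, table|
    rows = table.select { |m| m[:question_id] == question[:id] }
    rows_fetched += rows.size
    loaded[name] = rows
  end
  [loaded, rows_fetched]
end

joined = joined_rows(QUESTION, INCOMING, OUTGOING)
loaded, fetched = separate_queries(
  QUESTION, incoming_messages: INCOMING, outgoing_messages: OUTGOING
)
puts "joined query rows:     #{joined.size}"
puts "separate queries rows: #{fetched}"
```

Both approaches end up with the same 150 incoming and 80 outgoing
messages; the difference is only in how many rows have to be fetched
and turned into objects along the way.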

As a workaround, how about:

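# Load the question with only one association eager loaded...
question = Question.find_by_id(big_question.id,
  :include => :incoming_messages)
# ...then load the other association separately and attach it.
outgoing = Question.find_by_id(big_question.id,
  :include => :outgoing_messages).outgoing_messages
question.instance_variable_set('@outgoing_messages', outgoing)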

Also, note that for a single object you are probably better off lazy
loading has_many associations (eager loading belongs_to associations
is fine).  Eager loading has_many associations is only worthwhile
when you are fetching multiple objects at once (i.e. find :all).

Jeremy
