Gabe,

This is a lengthy response, sorry to those not interested :-)

I guess the issue I have with your destructive patch is that it  
hobbles existing behavior which (for me) returns a predictable dataset.

I've always known that :conditions which relate to :included  
associations can cause only part of the association's data to be  
loaded (I think that's been referred to as the mephisto bug).  When  
I've used it in the past it's been because that's exactly what I want.

Setting aside the performance issues (which you address in your non- 
destructive patch), the problem as I see it is that there is no way  
to tell find() that you want certain conditions applied in the  
prequery and a *different* set of conditions applied in the data- 
loading query (which I'll call 'realquery' from now on).

For the times when I have wanted *all* of the association's data yet  
wanted to limit the top-level selection based on association  
conditions I've done my own prequery (with a limit) to grab the id  
values I want to present.  I wrote a plugin that makes these  
prequeries a little easier to generate - http://svn.protocool.com/ 
public/plugins/values_of

# give me a list of the first 10 live album ids that have errored songs
ids = Album.values_of("distinct albums.id",
     :joins => 'inner join songs on songs.album_id = albums.id',
     :conditions => 'albums.deleted_at is null and songs.errored =  
1', :limit => 10) {|string| string.to_i}

# present those albums and *all* of their songs/artists
Album.find(ids, :include => [:songs, :artists])

That allows me to have 2 sets of :conditions - one for the prequery  
and one for the realquery - and guarantees that my constraints are  
always true for my returned data.

So... what I'm thinking is this: what about adding a new option to  
find() that allows you to specify 'prequery-only' conditions?  For  
example, something like a :limit_conditions option that utilized your  
non-destructive patch?

* The :limit_conditions option would only be valid in conjunction  
with :limit.
* Both :limit_conditions *and* :conditions options would be  
applicable to the prequery.
* Your non-destructive patch would determine the minimum joins  
required in the prequery based on :limit_conditions and :conditions
* The :conditions would only be applied to the realquery

Those rules for which conditions apply and when are to make sure that  
the constraints you specify in your find() is guaranteed 'true' for  
all of the data you get back:

# live albums with errored songs, and only those errored songs
# as in, current behaviour with your non-destructive patch
Album.find(:all, :limit => 10, :include => [:songs, :artists],
   :conditions => 'albums.deleted_at is null and songs.errored = 1')
# prequery joins songs and applies condition 'albums.deleted_at is  
null and songs.errored = 1'
# realquery joins songs, artistst and applies condition  
'albums.deleted_at is null and songs.errored = 1'

# live albums with errored songs, but I want *all* songs
Album.find(:all, :limit => 10, :include => [:songs, :artists],
   :limit_conditions => 'albums.deleted_at is null and songs.errored  
= 1')
# prequery joins songs and applies condition 'albums.deleted_at is  
null and songs.errored = 1'
# realquery joins songs, artists but applies no conditions

Now, this last example is probably where users might get confused  
about the interaction between :limit_conditions and :conditions - but  
the interaction is required in order to guarantee that you get back  
*exactly* what you specified in your single call to find().

# live albums with errored songs and artists named dave, all songs  
but the artists relation only has artists called 'Dave'
Album.find(:all, :limit => 10, :include => [:songs, :artists],
   :limit_conditions => 'albums.deleted_at is null and songs.errored  
= 1',
   :conditions => ['artists.name = ?', "Dave")
# prequery joins songs,artists applies condition 'albums.deleted_at  
is null and songs.errored = 1 and artists.name = 'Dave''
# realquery joins songs,artists and applies condition 'artists.name =  
'Dave''

It's not ideal, and probably a little difficult for newbies to  
understand, but at least it's *consistent*.

Ultimately I have to admit that when performance is a concern or the  
relations are hairy I will probably continue to reach for values_of()  
and another (soon to be released) plugin that hydrates associations  
with subsequent queries rather than with big-honking-joins.

Sorry for rambling, I hope I've not strayed too far from the original  
reasons you started talking about your problems...

Trev

On 13-Sep-07, at 6:48 PM, Gabe da Silveira wrote:

>
>
> On 9/13/07, Trevor Squires <[EMAIL PROTECTED]> wrote:
>
> Thanks for the reply, I believe I understand now.
>
> So the issue is that once it decides that *any* of the :include
> tables are referenced in :conditions it joins *all* of them - even
> though it's possible that some of the includes are not referenced.
>
> Personally I've not been bitten by performance issues with the
> prequery but if your non-destructive-patch improves that situation
> I'd say it was a good idea.
>
> But what do you think of the destructive patch?  In your example  
> statement below I don't like the fact that the conditions string is  
> used both to A) find albums with at least one errored song upload  
> and B) fetch only the errored uploads.
>
> If you just wanted A then you have to do an explicit join.  If you  
> wanted B you would be better off defining a custom association with  
> that condition directly attached.  With the existing code A&B are  
> conflated, which I think greatly diminishes the utility of adding  
> eager joins to the pre-query.
>
> Is your patch intelligent enough to cope with this?:
>
> Album.find(:all, :limit => 10, :include => [{:songs
> => :uploads}, :artists],
>    :conditions => 'uploads.errored = 0')
>
> Yes, it is just comparing the join string when the query is  
> compiles, doesn't matter what kind of nested includes you use.
>
> where the prequery joins songs and uploads but not artists?
>
> I bring it up because that strikes me as a reason why it just does
> all of the joins - bit of a pain to tease apart exactly which joins
> are absolutely required (not impossible - just a pain).
>
> Check the patch, it's actually quite simple.  The  
> include_eager_order? and include_eager_conditions? methods already  
> parse their respective strings for table names, so I just  
> refactored those methods to capture the table names in arrays.
>
> Then I add one reject method in the method chain that adds the  
> joins.  Essentially there is only half a line of new code required  
> to do this.  But again, this patch is my second choice.  I'm trying  
> to drum up support for the first patch.
>
> >


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby 
on Rails: Core" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/rubyonrails-core?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to