Thanks to the guidance of an answer submitted by LucaM on the
StackOverflow post, I was able to come up with a SQL query and
ActiveRecord find query that does it.
SQL:
SELECT c.park_id, a.name FROM parks a, features b, features_parks c
WHERE (c.park_id = a.id) AND (c.feature_id = b.id) AND (c.feature_id
IN (21, 51, 15, 24)) GROUP BY a.id HAVING count(*)=4;
Rails:
feature_ids = [21, 51, 15, 24]
@parks = Park.all(
:joins => :features,
:group => "parks.id",
:select => "parks.*",
:conditions => ["features.id in (?)", feature_ids],
:having => "count(*)=#{feature_ids.size}"
)
The key was grouping and using having to ensure that the items each
had the max number of features that we're searching on.
I hope this helps someone.
-Kevin
On Jul 8, 2009, at 12:22 PM, Kevin Elliott wrote:
> To further show an example of trying to do this, here is a 'real
> world' example. However, in this particular case, I'm only showing
> two features, but in real usage, a user might select as many as 50
> features or so.
>
> mysql> SELECT c.park_id, a.name, c.feature_id, b.name FROM parks a,
> features b, features_parks c WHERE (c.park_id = a.id) AND
> (c.feature_id = b.id) AND (c.feature_id = 21);
> +---------+-------------------------+------------+--------+
> | park_id | name | feature_id | name |
> +---------+-------------------------+------------+--------+
> | 3120 | Mitchell Park | 21 | Swings |
> | 21385 | Boulware Park | 21 | Swings |
> | 2351 | Bowden Park | 21 | Swings |
> | 21561 | Cameron Park | 21 | Swings |
> | 24791 | Ramos Park | 21 | Swings |
> | 2585 | Eleanor Park | 21 | Swings |
> | 2627 | Flood County Park | 21 | Swings |
> | 2374 | Burgess Park | 21 | Swings |
> | 22986 | Holbrook-Palmer Park | 21 | Swings |
> | 2799 | Huddart Park | 21 | Swings |
> | 3354 | Rinconada Park | 21 | Swings |
> | 3693 | Werry Park | 21 | Swings |
> | 26152 | Willow Oaks Park | 21 | Swings |
> | 23559 | Live Oak Manor Park | 21 | Swings |
> | 24295 | Oak Meadow Park | 21 | Swings |
> | 25562 | Sunnyvale Baylands Park | 21 | Swings |
> | 21354 | Cornelis Bol Park | 21 | Swings |
> | 21459 | Bubb Park | 21 | Swings |
> | 21910 | Cooper Park | 21 | Swings |
> | 24859 | Rengstorff Park | 21 | Swings |
> | 24043 | Monroe Mini Park | 21 | Swings |
> | 24867 | Rex Manor Playground | 21 | Swings |
> +---------+-------------------------+------------+--------+
> 22 rows in set (0.00 sec)
>
>
> mysql> SELECT c.park_id, a.name, c.feature_id, b.name FROM parks a,
> features b, features_parks c WHERE (c.park_id = a.id) AND
> (c.feature_id = b.id) AND (c.feature_id = 51);
> +---------+--------------------------+------------+---------+
> | park_id | name | feature_id | name |
> +---------+--------------------------+------------+---------+
> | 3120 | Mitchell Park | 51 | Sandpit |
> | 2351 | Bowden Park | 51 | Sandpit |
> | 2374 | Burgess Park | 51 | Sandpit |
> | 2474 | Coyote Point County Park | 51 | Sandpit |
> | 24867 | Rex Manor Playground | 51 | Sandpit |
> +---------+--------------------------+------------+---------+
> 5 rows in set (0.00 sec)
>
>
> So if I was trying to fetch all parks that have BOTH 'Swings' and a
> 'Sandpit', I'd expect to see:
>
> Mitchell Park
> Bowden Park
> Burgess Park
> Rex Manor Playground
>
> If this was limited to only looking for 2 or 3 features at a time,
> even on a hefty DB of 300,000 records, this wouldn't be too
> difficult. You would fetch all parks for each of the 2 or 3
> features, then eliminate all but the ones they share in common
> (exclusionary merging of some sort). However, with 50 features, this
> becomes unusable, since you'd have tons of redundancies when
> retrieving the data, and you'd have to recursively iterate over 50
> features.
>
> Any way to combine the queries into one, and subquery it to
> eliminate ones that don't have all the features requested in the
> query?
>
> -Kevin
>
> On Jul 7, 2009, at 5:44 PM, Kevin Elliott wrote:
>
>> I have two models, associated with a HABTM (actually using
>> has_many :through on both ends, along with a join table). I need to
>> retrieve all ModelA's that is associated with BOTH of two ModelB's. I
>> do NOT want all ModelA's for ModelB_1 concatenated with all ModelA's
>> for ModelB_2. I literally want all ModelA's that are associated with
>> BOTH ModelB_1 and ModelB_2. It is not limited to only 2 ModelB's, it
>> may be up to 50 ModelB's, so this must scale.
>>
>> I can describe the problem using a variety of analogies, that I think
>> better describes my problem than the previous paragraph:
>>
>> * Find all books that were written by all 3 authors together.
>> * Find all movies that had the following 4 actors in them.
>> * Find all blog posts that belonged to BOTH the Rails and Ruby
>> categories for each post.
>> * Find all users that had all 5 of the following tags: funny,
>> thirsty, smart, thoughtful, and quick. (silly example!)
>> * Find all people that have worked in both San Francisco AND San
>> Jose
>> AND New York AND Paris in their lifetimes.
>>
>> I've thought of a variety of ways to accomplish this, but they're
>> grossly inefficient and very frowned upon.
>>
>> Taking an analogy above, say the last one, you could do something
>> like
>> query for all the people in each city, then find items in each array
>> that exist across each array. That's a minimum of 5 queries, all the
>> data of those queries transfered back to the app, then the app has to
>> intensively compare all 5 arrays to each other (loops galore!).
>> That's
>> nasty, right?
>>
>> Another possible solution would be to chain the finds on top of each
>> other, which would essentially do the same as above, but won't
>> eliminate the multiple queries and processing. Also, how would you
>> dynamicize the chain if you had user submitted checkboxes or values
>> that could be as high as 50 options? Seems dirty. You'd need a loop.
>> And again, that would intensify the search duration.
>>
>> Obviously, if possible, we'd like to have the database perform this
>> for us, so, people have suggested to me that I simply put multiple
>> conditions in. Unfortunately, you can only do an OR with HABTM
>> typically.
>>
>> Another solution I've run across is to use a search engine, like
>> sphinx or UltraSphinx. For my particular situation, I feel this is
>> overkill, and I'd rather avoid it. I still feel there should be a
>> solution that will let a user craft a query for an arbitrary number
>> of
>> ModelB's and find all ModelA's.
>>
>>
>> How would *you* solve this problem?
>>
>>
>> Thanks a bunch,
>> Kevin
>>
>> (I've subsequently cross-posted this to StackOverflow's website since
>> they have a target audience that encompasses more than Rails, and is
>> still valid in those areas ---
>> http://stackoverflow.com/questions/1095571/habtm-finds-with-and-joins-not-or
>> )
>
> >
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby
on Rails: Talk" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---