Sure!

I've narrowed down the problem slightly by eliminating all parts of my 
index which are unnecessary. I'll start with the parts of the datamodel I 
suppose are relevant:

Doctor
   has_many :doctors_practices, :inverse_of => :doctor
   has_many :practices, :through => :doctors_practices
   has_many :counties,   :through => :doctors_practices

* I've got an explicit doctors_practices relation to be able to access 
various fields on the relation.
* The doctors_practices table contain the county id for convenience, to 
minimize the number of joins necessary to get to the counties a doctor has 
practices in.

Perhaps this design is just stupid and I should just be stopped right here. 
But if not, here goes...

If I set up the index in Doctor like this:

    indexes [lastname, firstname], :as => :name, :sortable => true
    indexes practices.name,     :as => :practice_name,      :sortable => 
true

And rebuild the index, the SQL of my doctor_core_0 index is something like 
this:

CAST(COALESCE("doctors"."lastname"::varchar, '') as varchar) || ' ' || 
CAST(COALESCE("doctors"."firstname"::varchar, '') as varchar) AS "name",
array_to_string(array_agg(COALESCE("practices"."name", '0')), ' ') AS 
"practice_name",
FROM "doctors"
LEFT OUTER JOIN "doctors_practices" ON "doctors_practices"."doctor_id" = 
"doctors"."id"
LEFT OUTER JOIN "practices" ON "practices"."id" = 
"doctors_practices"."practice_id"
GROUP BY "doctors"."id",
         "doctors"."lastname",
         "doctors"."firstname",
         "doctors"."id",
         "doctors"."lastname",
         "doctors"."firstname"


Which works just fine. If a doctor is connected to one practice, a single 
practice name is returned. If connected to multiple practices, each 
practice name is displayed only once.

But once I include an index on another relation through my join table, for 
example county, so my index looks something like:

    indexes [lastname, firstname], :as => :name, :sortable => true
    indexes practices.name,     :as => :practice_name,      :sortable => 
true
    indexes counties.name,       :as => :county_name,       :sortable => 
true

Then the resulting joins in the doctor_core_0 index start playing tricks on 
me. SQL is as follows:

SELECT
CAST(COALESCE("doctors"."lastname"::varchar, '') as varchar) || ' ' || 
CAST(COALESCE("doctors"."firstname"::varchar, '') as varchar) AS "name",
array_to_string(array_agg(COALESCE("practices"."name", '0')), ' ') AS 
"practice_name"
array_to_string(array_agg(COALESCE("counties"."name", '0')), ' ') AS 
"county_name"
FROM "doctors"
LEFT OUTER JOIN "doctors_practices" ON "doctors_practices"."doctor_id" = 
"doctors"."id"
LEFT OUTER JOIN "practices" ON "practices"."id" = 
"doctors_practices"."practice_id"
LEFT OUTER JOIN "doctors_practices" "doctors_practices_doctors_join" ON 
"doctors_practices_doctors_join"."doctor_id" = "doctors"."id"
LEFT OUTER JOIN "counties" ON "counties"."id" = 
"doctors_practices_doctors_join"."county_id"
GROUP BY "doctors"."id",
         "doctors"."lastname",
         "doctors"."firstname",
         "doctors"."id",
         "doctors"."lastname",
         "doctors"."firstname"

Now, if doctors are connected to multiple practices, all practice names and 
county names are repeated twice. See?

Not being very proficient in neither thinking sphinx or rails I'm having 
trouble seeing how to fix this. Perhaps you see a sensible solution? Or 
perhaps I need to rethink my design.. or both. :-)

Cheers,
Roger



kl. 22:52:28 UTC+2 mandag 30. juli 2012 skrev Pat Allan følgende:
>
> Hi Roger
>
> Not sure exactly what you're referring to - can you provide the SQL 
> statement and point out what you think could be more efficient?
>
> Cheers
>
> -- 
> Pat
>
> On 29/07/2012, at 5:06 PM, Roger Kind Kristiansen wrote:
>
> > Hi again,
> > 
> > I just noticed that the SQL generated to my sphinx config returns a bit 
> more than I expected. I'll not dig to deep into my data model or concrete 
> indices first, as I assume this is yet another silly newbie mistake which 
> is easy to pinpoint.
> > 
> > I've got one model which has a few associations (most are through a join 
> table, but not all). I've set up some indices and some attributes making 
> use of these associations. Now when I manually run the SQL generated by TS, 
> I notice the same values from the associated models are repeated multiple 
> times in each column. Is this expected behaviour, or is there some obvious 
> thing I'm missing which TS doesn't do for me automatically, like some 
> explicit grouping or unique constraints or something?
> > 
> > This hasn't caused me any trouble up until now, but now I'm trying to 
> sort by counting the number of entries in one of the associated models and 
> the numbers get all screwy. As far as I can understand it's related to the 
> mentioned duplication.
> > 
> > Cheers,
> > Roger
> > 
> > -- 
> > You received this message because you are subscribed to the Google 
> Groups "Thinking Sphinx" group.
> > To view this discussion on the web visit 
> https://groups.google.com/d/msg/thinking-sphinx/-/C_gW49zbm6kJ.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to 
> [email protected].
> > For more options, visit this group at 
> http://groups.google.com/group/thinking-sphinx?hl=en.
>
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Thinking Sphinx" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/thinking-sphinx/-/j0ML9ULNFaoJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/thinking-sphinx?hl=en.

Reply via email to