Malcolm asked, how to find the classes that were attended by all of the
students from a given list.
http://www.pointy-stick.com/blog/2006/06/12/sql-puzzle/

Then, he proposed a solution with HAVING clause:
http://www.pointy-stick.com/blog/2006/06/13/sql-puzzle-solution/

I'd like to submit another SQL solution.

The database structure was:

CREATE TABLE Class (
    id  integer NOT NULL PRIMARY KEY
);

CREATE TABLE Student (
    id integer NOT NULL PRIMARY KEY
);

CREATE TABLE Reln_Class_Student (
    class_id integer NOT NULL REFERENCES Class(id),
    student_id integer NOT NULL REFERENCES Student(id),
    PRIMARY KEY (class_id, student_id)
);

Like in Malcolm's example, following example finds the classes that are
attended by both 253 and 289 students.

SELECT id
FROM
    Class AS C
    INNER JOIN Reln_Class_Student AS s253
        ON (C.id = s253.class_id)
    INNER JOIN Reln_Class_Student AS s289
        ON (C.id = s289.class_id)
WHERE
    s253.student_id = 253
    AND
    s289.student_id = 289
;

In my example, the number of students on the list (here: two) is equal
to number of joins that are to be performed.

Each join must have different alias assigned, hence "s253" and "s289"
aliases for joined tables.

I don't know about how well does it scale with longer students list.
With just two students on the search list, with total of 2000 students,
200 classes and about 70 thousands records in the Reln_Class_Student
table, the query performed as follows:

 Nested Loop  (cost=0.00..1318.12 rows=16 width=4) (actual time=0.994..11.487 
rows=15 loops=1)
   ->  Nested Loop  (cost=0.00..1103.88 rows=39 width=8) (actual 
time=0.410..10.610 rows=36 loops=1)
         ->  Index Scan using class_pkey on "class" c  (cost=0.00..5.20 
rows=200 width=4) (actual time=0.099..3.914 rows=200 loops=1)
         ->  Index Scan using reln_class_student_pkey on reln_class_student 
s289  (cost=0.00..5.48 rows=1 width=4) (actual time=0.025..0.026 rows=0 
loops=200)
               Index Cond: (("outer".id = s289.class_id) AND (s289.student_id = 
289))
   ->  Index Scan using reln_class_student_pkey on reln_class_student s253  
(cost=0.00..5.48 rows=1 width=4) (actual time=0.012..0.013 rows=0 loops=36)     
    Index Cond: (("outer".class_id = s253.class_id) AND (s253.student_id = 253))
 Total runtime: 11.634 ms

The same query with list of 10 students and 10 joins executed in 5.8ms,
so I guess there's no problem with scalability of my solution. At least
with PostgreSQL.

I can imagine this solution applied to Django with something like:

Group.objects.filter(students__id = 253).filter(students__id = 289)

but currently, it would return zero groups as it would try to find
students with id equal to both 253 and 289 at once. In other words,
Django wouldn't perform two joins. It would add two WHERE statements to
one JOIN.

My knowledge about Django ends here. Maybe Django gurus will think of
a way to make Django perform two (and more) aliased joins with the same
table.

-- 
Maciej Bliziński <[EMAIL PROTECTED]>
http://automatthias.wordpress.com


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users
-~----------~----~----~----~------~----~------~--~---

Reply via email to