On Jul 23, 2007, at 3:45 PM, David Krings wrote:
So, I guess now that I got this settled I better learn what a left join is in case someone asks me to explain my code. :)

LEFT JOIN actually is short for LEFT OUTER JOIN. OUTER JOIN means that it doesn't require finding a match... if it doesn't, it returns null. LEFT OUTER JOIN means that the table on the left side of the join tries to join to the table on the right.

So... for an example, let's say a "user" table has an id and a "session" table also holds a reference to the user id.

SELECT u.id, s.id FROM user u LEFT JOIN session s ON u.id = s.user_id WHERE u.id = 5

If user id = 5 exists in the user table but there aren't any records with user_id = 5 in the session table, the result would be:
u.id            s.id
5               null

If you did an INNER JOIN:
SELECT u.id, s.id FROM user u INNER JOIN session s ON u.id = s.user_id WHERE u.id = 5

This would return an empty set because the INNER JOIN requires that it finds rows in both tables. So in your case, I actually think you could have done INNER JOINs - but since it's not broken, no need to fix, right :)

When you right a query like this:
SELECT u.id, s.id FROM user u, session s WHERE u.id = 5 AND u.id = s.user_id

You are doing a cartesian join - the commas imply the join. Usually, this works pretty much the same as an inner join but the difference between a cartesian join and an inner join it that the database tries to figure out all of the ways that the tables can be joined -- rather than just using what you specified in your ON clause. If you go back to your original query, you're joining everything off of the cupssbmain tables (based on the links to the CourseID in your where clause). The database tries to figure out all the different ways to connect the tables based off this and this is how it wound up with the 1248 rows. Also, the reason the SELECT DISTINCT worked when you were just getting the sbid was that it was going through all of those rows and eliminating the ones that returned the same sbid. This is why people say that DISTINCT is bad -- because it's a lot of extra work for the database to get this huge number of rows and then go and figure out which results aren't duplicates. If you take out the DISTINCT part of the query, you can see how many rows are actually getting returned and see what the database has to search through. When you added the position from the other tables, it then returned 1248 because that was all the different combinations of sbid, module, lesson, and page positions.

Going back to that original query, you can write it like this:

SELECT cupssbmain.StoryboardID AS sbid
FROM cupsmodules,  cupslessons, cupspages, cupssbmain
WHERE cupsmodules.CourseID = 23
  AND cupslessons.ModuleID = cupsmodules.ModuleID
  AND cupspages.LessonID = cupslessons.LessonID
  AND cupssbmain.PageID = cupspages.PageID
ORDER BY cupsmodules.Module_Position ASC,
  cupslessons.Lesson_Position ASC,
  cupspages.Page_Position ASC

I think that should give you the right result. So... it is possible without using INNER JOIN or LEFT JOIN. We had a discussion last year on NYPHP-MySQL about INNER JOIN versus commas (cartesian join) -- you so often see it with commas in tutorials and other people's code. But the concensus we came to was that it's best to be as specific as possible... so I've always used INNER JOINs instead of commas ever since.

Later,
Rob


_______________________________________________
New York PHP Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk

NYPHPCon 2006 Presentations Online
http://www.nyphpcon.com

Show Your Participation in New York PHP
http://www.nyphp.org/show_participation.php

Reply via email to