You are correct, that the situation you describe was not clearly presented in your previous reply. I humbly apologize for any offense.
Using the EXPLAIN you posted in your latest reply, you can translate your subquery into the following JOINed query SELECT p.offer_id FROM paytable p LEFT JOIN offers o ON o.advertiser_id = p.advertiser_id WHERE pt.login_id=1 AND o.offer_id is null; Which is not what I think you were actually trying to write. The terms selecting values from publisher_advertiser_blocks disappeared because they are on the *right* side of a LEFT JOIN and played no part in actually limiting your final results. Here is the needs statement from your first post: >>>>The goal of these queries is to select all offer_id's from `paytable` for a known login_id where that login_id is not "blocked" from that offer_id.<<<< I would write it this way CREATE TEMPORARY TABLE tmpBlocks SELECT DISTINCT o.offer_id FROM offers INNER JOIN publisher_advertiser_blocks pab ON pab.advertiser_id = o.advertiser_id AND pab.login_id = 1; ALTER TABLE tmpBlocks ADD KEY(offer_id); SELECT p.offer_id FROM paytable p LEFT JOIN tmpBlocks tb ON tb.offer_id = p.offer_id WHERE tb.offer_id IS NULL; DROP TEMPORARY TABLE tmpBlocks; One trick to working efficiently with "larger" datasets (millions of rows per table) is to minimize the number of rows being joined at one time. By breaking this query into two statements we keep our JOIN combinations to a minimum so that at each successive stage we are working with smaller sets of data than if we had written it as a single statement. The term "pab.login_id=1" is in the ON clause because your index on publisher_advertiser_block is defined in such a way that makes it better to be in the ON clause than in the WHERE clause (also information from your original post). There was a posting not long ago that said that the statistics of a temporary table's indexes were not updated if they exist before you fill the table with data. That reason and the fact that it's faster to build an index on a populated table than populate an indexed table is why I added the index after filling the table with data. Again, I apologize for any offense I may have caused, Shawn Green Database Administrator Unimin Corporation - Spruce Pine Devananda <[EMAIL PROTECTED]> wrote on 02/01/2006 02:17:36 PM: > [EMAIL PROTECTED] wrote: > > Devananda, > > > > I have to support Peter on this one. What he submitted to you is a > > perfectly appropriate solution. It seems as though you rejected his > > assistance before even trying to see if it would work. > On the contrary, I have tried his suggestions and they do not work (see > the email I just sent to the list). I have also tried, but obviously not > done very well, to explain why they will not work in this particular > case. The queries Peter has suggested actually take many minutes to > return when run on the real data, whereas my initial subquery takes only > a few seconds. I had initially hoped to find a way, using a JOIN, that > would take less than a second. > > > d1 mysql> explain SELECT DISTINCT pt.offer_id FROM paytable AS pt INNER > JOIN offers AS o USING (offer_id) LEFT JOIN publisher_advertiser_blocks > AS pab1 ON pt.login_id=pab1.login_id LEFT JOIN > publisher_advertiser_blocks AS pab2 ON > o.advertiser_id=pab2.advertiser_id WHERE pt.login_id=1 AND > pab1.login_id IS NULL OR pab2.advertiser_id IS NULL; > +----+-------------+-------+--------+---------------+--------- > +---------+-----------------------+--------- > +------------------------------------+ > | id | select_type | table | type | possible_keys | key | key_len > | ref | rows | Extra | > +----+-------------+-------+--------+---------------+--------- > +---------+-----------------------+--------- > +------------------------------------+ > | 1 | SIMPLE | pt | index | PRIMARY | PRIMARY | 5 > | NULL | 1814973 | Using index; Using temporary | > | 1 | SIMPLE | o | eq_ref | PRIMARY | PRIMARY | 4 > | affiliate.pt.offer_id | 1 | Using where; Distinct | > | 1 | SIMPLE | pab1 | ref | PRIMARY | PRIMARY | 4 > | affiliate.pt.login_id | 7 | Using index; Distinct | > | 1 | SIMPLE | pab2 | index | NULL | PRIMARY | 8 > | NULL | 62 | Using where; Using index; Distinct | > +----+-------------+-------+--------+---------------+--------- > +---------+-----------------------+--------- > +------------------------------------+ > > > > d1 mysql> explain SELECT offer_id FROM paytable WHERE login_id=1 AND > offer_id NOT IN ( SELECT offer_id FROM offers LEFT JOIN > publisher_advertiser_blocks AS pab USING (advertiser_id) WHERE > pab.login_id=1 ); > +----+--------------------+----------+-------- > +-----------------------+---------+--------- > +--------------------------------------+---------+--------------------------+ > | id | select_type | table | type | possible_keys | > key | key_len | ref | rows | > Extra | > +----+--------------------+----------+-------- > +-----------------------+---------+--------- > +--------------------------------------+---------+--------------------------+ > | 1 | PRIMARY | paytable | index | NULL | > PRIMARY | 5 | NULL | 1814973 | > Using where; Using index | > | 2 | DEPENDENT SUBQUERY | offers | eq_ref | PRIMARY,advertiser_id | > PRIMARY | 4 | func | 1 | > Using where | > | 2 | DEPENDENT SUBQUERY | pab | eq_ref | PRIMARY | > PRIMARY | 8 | const,affiliate.offers.advertiser_id | 1 | > Using where; Using index | > +----+--------------------+----------+-------- > +-----------------------+---------+--------- > +--------------------------------------+---------+--------------------------+ > 3 rows in set (0.03 sec) > > > There have been and > > continue to be SQL-driven databases around that have not had and do not > > have subquery support. They function perfectly well without them. By your > > response, it appears to me you don't know how a LEFT JOIN is supposed to > > operate. That is one of the most basic tools of any relational DBA's > > arsenal. > > > Perhaps I didn't phrase my response well, regarding 1to1 relationship, > but I do understand how to use a LEFT JOIN. I freely admit I don't have > as much experience with all the types of joins, or in other DBMS, as > many of the folks on this list. Also, I _know_ that subqueries and IN > lists are not nearly as efficient as JOINs. That is exactly _why_ I > asked in the first place!! > > :) > > Regards, > Devananda > > > > I implore everyone not comfortable with them to learn their "JOIN"s (INNER > > JOIN, LEFT JOIN, RIGHT JOIN, CROSS JOIN, etc.) before you learn how to > > subquery (and I do NOT mean that comma-separated crap that poses as a > > CROSS JOIN). So many of the performance problems posted to this list are > > subquery-related when a simple, straightforward JOIN or two will solve the > > same problem usually with _much_ better response time. This post is a > > perfect example. > > > > Explicit JOIN statements have the advantage of possibly using indexes at > > every layer of data aggregation. Subqueries lose that ability at the level > > of each subquery. Some subqueries, depending on how they are written, are > > executed once PER ROW of your parent query data. If it takes .01 seconds > > to run a subquery by itself and you have 10000 rows of parent data, that > > would be 100 seconds of nothing but subquery execution time before you > > even get to the GROUP BY or ORDER BY processing part of the query. > > Rewriting the same query to use an explicit JOIN may turn the .01 seconds > > of the "naked" subquery into .02-.04 seconds because of the indexing > > advantage and because it computes a set of matches between the table only > > once. > > > > Subqueries have their uses and are appropriate for many situations. > > However, IMHO, they should be your second-line of attack because of their > > inherent weaknesses. I believe this to be true for the majority of the > > RDBM systems in the world today. > > > > Shawn Green > > Database Administrator > > Unimin Corporation - Spruce Pine > > > > > > -- > Devananda vdv > > > http://devananda-vdv.blogspot.com/ > http://mycat.sourceforge.net/ >