I have a cgi-bin script which generates SQL expressions based on some user
input.  I noticed yesterday that for certain input, the script would
return wildly incorrect results from the database.  The problem was fixed
by adding parentheses around part of the Boolean expression in the WHERE
clause, but I can't figure out why I was getting the results I was getting
before.  Any takers?

Here is the setup:

  Table acts
  Fields:    resolution VARCHAR(10) NOT NULL PRIMARY KEY,
             title TEXT NOT NULL,
             author VARCHAR(255),
             action VARCHAR(255), etc. (fields which don't matter)
  

  Table acts_text
  Fields:    resolution VARCHAR(10) NOT NULL PRIMARY KEY,
             words MEDIUMTEXT


When I execute the following query:

    select acts.resolution, title from acts, acts_text
    where acts.resolution=acts_text.resolution and 
    words like "% china %" or words like "% japan %"; 

I get what appears to be all the rows in the acts table repeated by a
factor of 10; i.e. there are 2000 rows in the table, and I'm getting a
result containing 20,000 rows.


However, when I change the query slightly by putting () around the or
(which is what I intended anyway) as in

    select acts.resolution, title from acts, acts_text
    where acts.resolution=acts_text.resolution and 
    (words like "% china %" or words like "% japan %"); 


I get exactly the correct results.  I understand that without the () the
Boolean isn't being processed the way I intended, but what I don't
understand is why every row in the table is being returned 10 times over,
especially since  most of these rows don't contain the words china or
japan.

Adding another AND clause produces exactly the same results:

    select acts.resolution, title from acts, acts_text
    where acts.resolution=acts_text.resolution and 
    acts.resolution like "1988% " and
    words like "% china %" or words like "% japan %"; 

produces 10 times the number of rows in the table, whereas

    select acts.resolution, title from acts, acts_text
    where acts.resolution=acts_text.resolution and 
    acts.resolution like "1988% " and
    (words like "% china %" or words like "% japan %"); 


produces exactly the correct results.  Anyone?

Also, will I realize a performance boost by reorganizing the latter query
as follows or will the parser do this for me automatically?

    select acts.resolution, title from acts, acts_text
    where acts.resolution like "1988% " and
    acts.resolution=acts_text.resolution and 
    (words like "% china %" or words like "% japan %"); 





---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to