Yves,

> My problem is that the sub-select in line 7
> ("SELECT 1") takes a rather long time

It might be possible to simplify. Do I have the schema right?

message (messageID)
keylist (keylistID)
tag ( tagID, readaccesskeylist references keylist(keylistID) )
message_revision_tag ( ???, messageID references message(messageID), tagID references tag(tagID))

Or is there a message_revisions table missing from your description? On what I have, if the condition tag.readaccesskeylist IS NOT NULL defines a 'banned' condition, perhaps the query can be written without referencing the keylist table, since the requirement simply translates to finding all messages with no tag having a non-null readaccesskeylist, i.e. ...
(i} find the messages which have a non-null tag.readaccesskeylist,
(ii) find the messages which are not in [i].

(i) Finding messages which have a deny-access tag looks like a simple join:

SELECT DISTINCT messageID
FROM message_revision_tag AS mrt
JOIN tag AS t ON mrt.tagID=t.tagID
WHERE t.readaccesskeylist IS NOT NULL;

(ii) We get the messages not in the above result with a simple exclusion join:

SELECT messageID
FROM message m
LEFT JOIN (
 SELECT DISTINCT messageID
 FROM message_revision_tag AS mrt
 JOIN tag AS t ON mrt.tagID=t.tagID
 WHERE t.readaccesskeylist IS NOT NULL
) AS banned USING (messageID)
WHERE banned.messageID IS NULL,

Or did I miss something?

PB
http://www.artfulsoftware.com

-----

Yves Goergen wrote:
Hi,

I have a performance problem with one of my SQL queries. It's a rather complex one so I'll spare you the details. This is the situation:

In my system, there's messages, tags and keylists. Each message has message_revisions, each message_revision can be assigned tags (stored in message_revision_tag). Each tag points to a keylist that contains all keys that grant access to messages with that tag. If a tag has no keylist assigned (ReadAccessKeylist IS NULL), then everybody may access the messages.

This query finds all messages that don't have a tag assigned that would deny access to it. (Assume every message has only a single revision with the number 1, for now. The actual user comparison with another sub-select is hidden in <some more>.)

SELECT m."MessageId"
FROM "message" m
WHERE
  NOT EXISTS
    (SELECT
      EXISTS
        (SELECT 1
        FROM "keylist" tk
        WHERE tk."KeylistId" = t."ReadAccessKeylist" AND <some more>)
      AS "Allowed"
    FROM "message_revision_tag" mrt
      JOIN "tag" t USING ("TagId")
    WHERE mrt."MessageId" = m."MessageId" AND
      mrt."RevisionNumber" = 1 AND
      t."ReadAccessKeylist" IS NOT NULL
    HAVING NOT "Allowed")

My problem is that the sub-select in line 7 ("SELECT 1") takes a rather long time. (When I remove it, it's much faster.) I'm not sure why, because there's not a single keylist in that table, however. Another issue is that this query should actually never be regarded. The condition in the second-last line is always false. A simple test confirms that:

SELECT COUNT(*) FROM "tag" WHERE "ReadAccessKeylist" IS NOT NULL
-> 0

So there should never be a reason why the FROM in line 11 would result in a row (filtering out with the conditions, of course). But it still gets executed. When I make sure that the condition is always false, by adding a "AND 0" just before the HAVING clause, the whole thing runs much faster.

I have no separate timings, but it's in the magnitude of 1 vs. 5-10 milliseconds for the query. Run a hundred times makes a noticeable delay.

My understanding of it all was that first the FROM clause is regarded to see what rows there are. Then WHERE filters them, then SELECT will pick some columns (and thereby execute my sub-select expression) and finally HAVING filters again. Since in my theory there is no single row, SELECT has nothing to do. But obviously it has.

Some suggestion what's going on?


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to