[EMAIL PROTECTED] wrote:
Yes, a ranged query should respond faster than a negation. In some cases you can seriously improve query performance for a negation query if you split it into two range queries unioned together. Here is a pseudo example:

This query should be slow due to the table scan it takes to test the condition:

SELECT ...
FROM ...
WHERE indexfield <> 16

This query will use the index if the number of rows with indexfield != 16 is small enough (< about 30%).

This query should be noticeably faster due to using ranged queries (partial index ranged matches):

(       SELECT ...
FROM ...
WHERE indexfield < 16
) UNION (
        SELECT ...
FROM ...
WHERE indexfield > 16
)

At the very least, you'll want to make that UNION ALL, rather than just UNION, so mysql doesn't waste time looking for duplicates to remove. But even then, this may be slower.

First, these two unioned queries may still not use the index. If the number of rows with indexfield < 16 is too large (more than about 30%) the first query will be a table scan. If the number of rows with indexfield > 16 is too large (more than about 30%) the second query will be a table scan. In fact, if the number of rows with indexfield = 16 is less than about 40% of the table, then at least one of the two unioned queries is guaranteed to be a table scan. Worse yet, this query stands a good chance of being twice as long as the single, != query, because it may require 2 table scans!

Second, even if both unioned queries use the index, the result still may not be faster. If the combined number of matching rows is too high, the full table scan should be faster than the indexed lookups.

For example:

  SELECT cat, COUNT(*) FROM inits GROUP BY cat;
  +------+----------+
  | cat  | COUNT(*) |
  +------+----------+
  |    0 |     5743 |
  |    1 |     3792 |
  |    2 |    30727 |
  |    3 |     1926 |
  |    4 |     7812 |
  +------+----------+

19273 rows (38.55%) match cat != 2, with roughly half (~19%) on either side.

First, the != case:

  EXPLAIN SELECT * FROM inits WHERE cat != 2 \G
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: inits
           type: ALL
  possible_keys: cat_idx
            key: NULL
        key_len: NULL
            ref: NULL
           rows: 50000
          Extra: Using where

As expected, a full table scan.

  SELECT * FROM inits WHERE cat != 2;
  ...
  19273 rows in set (0.37 sec)

Now, the unioned range queries:

  EXPLAIN SELECT * FROM inits WHERE cat < 2
          UNION ALL
          SELECT * FROM inits WHERE cat > 2 \G
  *************************** 1. row ***************************
             id: 1
    select_type: PRIMARY
          table: inits
           type: range
  possible_keys: cat_idx
            key: cat_idx
        key_len: 5
            ref: NULL
           rows: 5680
          Extra: Using where
  *************************** 2. row ***************************
             id: 2
    select_type: UNION
          table: inits
           type: range
  possible_keys: cat_idx
            key: cat_idx
        key_len: 5
            ref: NULL
           rows: 6543
          Extra: Using where
  *************************** 3. row ***************************
             id: NULL
    select_type: UNION RESULT
          table: <union1,2>
           type: ALL
  possible_keys: NULL
            key: NULL
        key_len: NULL
            ref: NULL
           rows: NULL
          Extra:


As hoped, mysql plans to use the index for each query.

  SELECT * FROM inits WHERE cat < 2
  UNION ALL
  SELECT * FROM inits WHERE cat > 2;
  ...
  19273 rows in set (0.78 sec)

Despite (because of) using the index, this takes more than twice as long!

Of course, if you have to do a table scan ANYWAY (because a value you have in a constraint is not in an index) then this won't help. This optimization is all about getting the engine to use an index whenever it can instead of performing a table scan. Of course, that is not always possible.

Even when using an index is possible, it is not always desirable. I'd suggest not trying to outfox the optimizer until you've first determined it is making bad choices, and then test to make sure the solution is actually an improvement.

Michael

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to