A good decision on wether to use multi-column indexes will take into account the cardinality of the data. Sometimes a multi-column index is simply the right answer, particularly if the values in the first column of the index do not limit the rows sufficiently. Also in where exists clauses the rdbms will not even load the row if the primary key is the composite index, but it will have to. Indexes are best thought of like caches, You use the right caching mechanism for the job, and you only do it when you can demonstrate performance improvement.

The mysql reference is true but not a matter of concern in a large number of cases when composite keys are being used. Analyising complex queries is a good idea, but hunting out and eliminating multi-column indexes doesn't seem helpful.

On Aug 26, 2009, at 6:40 AM, Jean-Michel Pouré wrote:

Le mercredi 26 août 2009 à 08:30 -0400, Jamie Holly a écrit :
When writing queries they should be targeted for performance on all
supported RDBMS, yet the emphasis should be put on MySQL. The simple
fact that a vast (vast) majority of Drupal sites out there run MySQL
is
reason enough for that, and until there is some major cosmic shift in
what hosting companies offer that's going to remain the fact.

Dear Jamie,

I have been rewriting a generic guide for both MySQL and PostrgeSQL:
Guidelines for writing efficient SQL code http://drupal.org/node/ 559302

These are very generic guidelines explaining why you should always log
server-side queries, use Devel module and ANALYSE your queries.

I also explain the notion of sequential scan and CPU time needed.

Then I analyse simple facts using EXPLAIN.


Also on indexes you say "Therefore the proposed solution to enhance
Drupal is to hunt for multi-column indexes and replace them with
single
indexes.". MySQL makes very effecient use of multi column indexes. On
countless queries in Drupal changing to only single column indexes
would
result in full table scans, temporary tables and filesorts in
numerous
places.

About dual-field indexing: PostgreSQL and MySQL work the same way.

Some databases claim that a dual index works equally right and left.
This is not portable and probably not true, as multi-column indexing may
result in a sequential scan in complex queries:

Reference:
According to

    http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html:

MySQL cannot use an index if the columns do not form a leftmost
        prefix of the index.

    http://www.postgresql.org/docs/8.4/static/indexes-multicolumn.html

        index could in principle be used for queries that have
        constraints on b and/or c with no constraint on a — but the
        entire index would have to be scanned, so in most cases the
        planner would prefer a sequential table scan over using the
        index.

Therefore, when needed, it could be needed to add single indexes AND
keep dual indexes.


Do LEFT JOINS outperform INNER JOINS? Yes
Do INNER JOINS outperform LEFT JOINS? Yes

I don't believe so because of the nature of INNER JOINs which remove
NULL rows and thus consume more CPU time. They only work faster on
indexes on dual-fields when used properly.

This needs to be studied using ANALYSE on very large SQL servers, not
simple installs. Drupal website database probably is a good testing
system.

I hope that you can comment my code and improve the English and
technical aspects.

Kind regards,
Jean-Michel




Reply via email to