> On Sunday, 25 November, 2012, 11:58, Jay A. Kreibich wrote:

>   each column is usually undesirable.  A given SELECT can usually only
>   use one index per query (or sub-query), so it rarely makes sense to
>   stack up the indexes... adding unused indexes only slows down
>   insert/update/deletes, as well as makes the file size much larger.

Generally speaking indexes should be treated as a performance trade-off.  When 
you create an index (other than a UNIQUE index used to enforce a constraint, or 
an index on a parent or child key in a foreign-key relationship, where such an 
index may greatly increase INSERT or UPDATE performance) you are "moving" 
execution time from the retrieval processing to the maintenance processing of 
your application.

When you add an index, you are (usually) optimizing retrieval and query 
operations.  The execution time saved during such query operations does not 
disappear (it is not recovered).  These processor cycles and I/O operations are 
"removed" from retrieval operations and "spent" when you perform updates to the 
database to maintain the indexes.  In other words, optimization does not make 
all operations faster -- the time required for your application to perform its 
functions can be viewed as a fixed size bag of water.  When you optimize some 
operation by adding an index you are "pushing in" the bag of water in some 
specific place.  It bulges out somewhere else, and some other operation becomes 
slower because now you have to perform additional operations elsewhere to 
maintain the indexes.

The objective is to save a "huge" amount of processing time in one operation 
and spend that savings in "wee bits" distributed over many other operations 
where it is not noticeable.  For example, adding a particular index may 
decrease the time to perform a certain query from 5 minutes to 5 seconds.  
However, it will add a few hundred milliseconds to each update or insert 
operation.  If this trade-off is acceptable, then you have made a good 
trade-off.  On the other hand, if the indexes you added to optimize the query 
increases each update/insert by a few seconds, the trade-off may not be 
acceptable.  You may need to find an acceptable middle ground ...

>   Additionally, indexing any column (even one used heavily in a query
>   filter) is not useful if the index doesn't reduce the working data
>   set to ~10% of the rows.  In other words, having an index on a column
>   that has all one value (be it NULL or anything else) accomplishes
>   nothing but significantly slowing down queries that use the index.
>   Same is true, for example, of a True/False column with an even
>   distribution.  Generally, unless the index can be used to eliminate
>   the vast majority of rows, it will slow down, rather than speed up, a
>   query.

The exception to this is, of course, where the index created is a covering 
index because using a covering index, while it may not necessarily reduce the 
number of rows significantly, eliminates the accesses to the base table to 
retrieve data that might be being used in further operations.

Care needs to be taken to not prematurely add indexes that will add maintenance 
cost but not significantly improve query performance (ie, be careful not to 
just add water into the bag -- the objective is to poke it around, not just add 
more water).  Adding indexes required to enforce or optimize constraint and 
foreign key operations are almost always required -- but other indexes should 
not be added until you are sure that they will actually be required because 
overuse of indexes can severely hamper overall performance.

---
()  ascii ribbon campaign against html e-mail
/\  www.asciiribbon.org



_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to