On Thu, Jun 17, 2010 at 08:44:25AM -0500, Black, Michael (IS) scratched on the wall: > Is there any advantage/disadvantage to having seperate indexes in a > case like this?
In a case like this, no. If each column was indexed individually one one of them would be used in this query. There is a (very) small performance gain in using the compound index, but you'd have to do a lot of data analysis to figure out if the additional overhead and storage from the second column is worth the gains in query speed. That's a very subjective analysis anyways. In any index placement there are advantages and disadvantages. Different indexes provide different performance boosts, and you very much need the right tool for the job. It is a hard problem. > In other words, as a simple design goal to have a separate index > that matches whatever you select on and order on? Yes and no. If the only concern is query performance, then in theory this is true. But indexes have cost associated. They add noticeable overhead to INSERT/UPDATE/DELETE, and each additional index adds additional overhead. They also add size to the database. Even if your query-to-modification ratio is very very high on the query side, adding indexes that provide very marginal benefits may still hurt you, thanks to increased I/O costs and cache crowding. So, like everything, there is a huge "it depends." The biggest thing is likely query-to-modify ratio, followed by query frequency. You don't spend a lot of time optimizing the query (or index structure) for a maintenance routine that runs once a week, especially if that modification adds 0.1% overhead to the query you run 10,000 times a day (or maybe you do, but the circumstance would be extreme). Index placement is deep magic, and is highly dependent on the database, the database product, the platform (cache memory, disk speeds), the queries, the query frequency, the data flux rate, the phase of the moon, and the alignment of the planets. It also represents this odd conflict of interest in the whole philosophy of SQL query design. In theory, SQL is all about saying what you want, and letting the database engine to all kinds of mysterious stuff to get you the results as fast as possible. In practice, it only half works that way, so there is a lot of second-guessing about where and when to place indexes. In many ways, that takes more skill and understanding than if SQL just allowed more explicit query designation. Modern query optimizers are pretty good at doing a good job with the tools at hand, but the system is not well setup to provide feedback to the developer-- "Put an index on this column and you'll boost performance 23.54%", for example. This isn't unique to SQLite, it is an issue with all SQL databases. Indexes are outside of the relational model, so they have evolved in a somewhat haphazard way, like the rest of SQL. It doesn't help that the foundation ideas were laid down when a megabyte or two of data was HUGE. It hasn't scaled so well to datasets where good indexes are a critical requirement, and not a helpful addition. > It seems to me that if you do "SELECT * FROM log WHERE > id_clie...@idclient ORDER BY utc DESC LIMIT 1" aren't you better off > with two indexes, one for id_client and one for utc? That way you > don't have to worry about order dependence at all. It seems only > if you start doing WHERE clauses on both should you need a compound index. Doesn't work that way. Once a set of rows have been extracted from an index, the data is no longer part of the original table, so a second index can't be used. You can look deeper into the same index, but not a second index. For example, I hand you two phone books--- one ordered by the standard last-name, first-name, etc., and a "reverse" phone book that is ordered by telephone number (for number => name lookups). I then ask you to provide a list of all the people with a last name of "Smith", and I want the list ordered by phone number. You can use either phonebook to get the answer, but you can't use both. You either lookup all the Smiths and manually sort that sub-set, or you scan the whole reverse book, plucking out the Smiths as you find them (and knowing the list is already in order and no further sorting is required). The only way to use an index to fill both conditions is if the phonebook is sorted by both name and then phone number... e.g. a compound index on just the right columns. As a general rule of thumb, most database systems can only use one index from one table per query (or subquery). There are a small handful of exceptions (such as chained OR conditions), but that's a good place to start. > And in this specific case unless you have a boat load of utc's for > each client_id the utc index isn't going to buy you much at all. Correct. It is doubtful the I/O overhead is worth it. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Intelligence is like underwear: it is important that you have it, but showing it to the wrong people has the tendency to make them feel uncomfortable." -- Angela Johnson _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users