Hi Justin, > I'm after a little help/advice with indexing... > I have a table for measuring hits on each page within a site: > year month page hits > > basically, my php script looks for a row matching this year (2002), this > month (12) and this page (something.php). if it finds it, increment by 1, > otherwise insert a row. > > All is working fine. > However, I'm not sure what column(s) would be best for indexing... this > small PHP script gets called on every hit of every page (with one or two > queries), so I'd like to optimise it in any way I can, finding a balance > between > a) quickly finding out if the row already exists > b) quickly inserting the row if it doesn't > I'm not worried about how long reporting the stats takes. > Or am I worried about nothing?
The purpose of an index is to speed up... because the hit-count query is only looking for one page-addr out of a collection, indexing that column will speed up both retrieval and UPDATE. An index will slow down certain transactions, eg INSERT, because not only does data have to be added to the tbl, but an new index entry must be made as well. I can't imagine that this will represent a significant proportion of the transactions you describe. For reporting you could create a year/month index, but will it make much difference/how much data are we talking about? If you have it, the only time the hit-count will take a performance hit is the first time a page is accessed each month - so I'll assume a non-issue. On the other hand... why keep all this data in one table. In the 'good old days' we not only kept transaction data separated from 'old' records, but might even keep them in separate DBMSes - for exactly the considerations you are discussing. So, any data from Nov 2002 backwards has no place in the hit-count recording process. It could be removed to a separate table. Another view: do you need to provide stats that are up-to-the-minute/second, or would data accurate to midnight-last-night be acceptable? Now you could take a 'snapshot' every ?midnight and run analyses against the snapshot, whilst the hit-count continues unabated and unaffected. If the tables were separate, you could go for minimal indexing of the transaction system, and index everything under the sun in the statistics table! You've read the manual, I assume. The CREATE INDEX and ALTER entries provide better coverage than CREATE TABLE - follow a link or two from there. These talk of index 'advantages' in and amongst the 'methods'. Somewhere there is a discussion of index 'costs', but I've not found it in a quick perusal - definitely covered in DuBois/MySQL. Enough for round one? =dn > sql,query --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php