On 4/23/06, Nick Hill <[EMAIL PROTECTED]> wrote: > 1) I can improve performance by a factor of 2-2.5 by changing the double > lat/lon to an integer then selecting on an integer. > > 2) I have concluded that for each 10 fold increase in the number of > records, select queries take twice as long. For each doubling of the > number of returned records, there is a sqrt(2) increase in select query > time.
I've noticed a couple things. 1) Right now you're emulating spatial index. 2) In future, you're going to emulate partitioning. Why do you think that doing this stuff manually is better than using builtin capabilities? > As the database grows, it would likely improve database performance by > splitting an individual table into several thousand tables using the > file system directory btree algorithm to effectively pre-select the data > before the query is handled to the MySQL engine. This is not a neat > solution. A much better way would be to improve the mysql index > performance on very large numbers of records. Selects against a table use b-trees too. Splitting data into lot of tables won't help with selects at all (well, it may help on scans with concurrent large data sets if data will be spread across different physical drives, but not with regular range lookups that you're doing). It will only help with inserts. > Given that there is such a strong relationship between the number of > records returned, and query time, I conclude that the whole index tree > is matched for every given number of root x records returned. If all > records we are matching are under a single node or under a small number > of nodes in the index tree, perhaps there is some way of telling the > database engine to ignore the rest of the index tree. What is a 'root record'? Are you speaking about internal representation of b-tree? > Could this work, or am I misunderstanding how the index tree works? Are > there existing optimisations which can de-couple the relationship > between number of records and query time where the records I am > selecting are within a small range? For studying select query performance issues it's better think about index as simply about a sorted array with random-access, where each random access costs O(lgN) and accesses to adjanced data cost O(1). If your points are spread uniformly in space, cost of select query you've shown is O(N*lgN) -- Alexey Polyakov -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]