kimky...@fhda.edu ("Kyong Kim") writes: > I was wondering about a scale out problem. > Lets say you have a large table with 3 cols and 500+ million rows. > > Would there be much benefit in splitting the columns into different tables > based on INT type primary keys across the tables?
To answer your question properly requires more information: 1. Expected table structure. Can you show the current CREATE TABLE xxx\G output? 2. Expected use cases to extract data? 3. Do you expect to delete data frequently, or are you only inserting data, or is there a mix of inserts and deletes? If so provide more info. I've come across situations where a large table like this caused lots of problems. There were lots of concurrent "delete batches" (cleaning up) and at the same time lot of inserts. At the same time there were large groups of selects to collect certain sets of data for presentation. Perhaps you are doing something similar? If you do something similar you may find that it's extremely important to get the keys right especially the primary keys so that data retrieval (for SELECTs or DELETEs) is as fast as possible (using clustered indexes [PRIMARY KEY in innodb]). If not or if the queries overlap you may find performance degredation a big issue as Innobase manages the locks to ensure that the concurrent statements don't interfere. You can also use merge tables sitting on top of MyISAM "per year" or "per whatever" data in each table. That avoids you having to find data for 2009 as you look in table xxx_2009, so this can be a big win. MyISAM has the inconvenience that if the server ever crashes recovery of these tables can be very timeconsuming. Innodb has a larger footprint for the same data. So it's hard without more information on the structure and the use cases to answer your question. In fact if you have the time, try out and benchmark different approaches and see which is best for your requirements. Just remember that as the data grows the initial measurements may not be consistent with behaviour you see later. Also if you are looking at a large amount of data like this appropriate server tuning can influence performance significantly. Hope this helps. Simon -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org