kimky...@fhda.edu ("Kyong Kim") writes:

> I was wondering about a scale out problem.
> Lets say you have a large table with 3 cols and 500+ million rows.
> 
> Would there be much benefit in splitting the columns into different tables
> based on INT type primary keys across the tables?

To answer your question properly requires more information:

1. The table structure. Can you show the output of SHOW CREATE TABLE xxx\G?
2. How do you expect to query the data (the use cases for extracting it)?
3. Do you expect to delete data frequently, are you only inserting data,
   or is there a mix of inserts and deletes? If there is a mix, please
   provide more info.

I've come across situations where a large table like this caused a lot
of problems. There were many concurrent "delete batches" (cleaning
up) running alongside a lot of inserts, while large groups of SELECTs
collected certain sets of data for presentation. Perhaps you are doing
something similar? If so, you may find that it's extremely important
to get the keys right, especially the primary key, so that data
retrieval (for SELECTs or DELETEs) is as fast as possible. In InnoDB
the PRIMARY KEY is the clustered index, so rows are stored in primary
key order. If the keys are not right, or if the queries overlap, you
may find performance degradation a big issue, as InnoDB manages the
locks to ensure that the concurrent statements don't interfere with
each other.
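
As a rough sketch of what "getting the primary key right" can look
like: the table and column names below (samples, source_id,
sampled_at, value) are made up for illustration, and the right key
depends entirely on your real queries.

```sql
-- Hypothetical 3-column table; names are illustrative only.
-- The composite PRIMARY KEY is the InnoDB clustered index, so rows
-- for one source_id are stored together in sampled_at order.
CREATE TABLE samples (
    source_id  INT UNSIGNED NOT NULL,
    sampled_at DATETIME     NOT NULL,
    value      INT          NOT NULL,
    PRIMARY KEY (source_id, sampled_at)
) ENGINE=InnoDB;

-- A SELECT that follows the key reads one contiguous range of rows.
SELECT sampled_at, value
FROM samples
WHERE source_id = 42
  AND sampled_at >= '2009-01-01'
  AND sampled_at <  '2009-02-01';

-- A cleanup DELETE that follows the same key, run in limited batches
-- so that each statement holds its row locks only briefly.
DELETE FROM samples
WHERE source_id = 42
  AND sampled_at < '2008-01-01'
LIMIT 10000;

-- Check that the optimizer really uses the PRIMARY key.
EXPLAIN SELECT sampled_at, value
FROM samples
WHERE source_id = 42
  AND sampled_at >= '2009-01-01'
  AND sampled_at <  '2009-02-01';
```

If the SELECTs and the delete batches work on different key ranges
they are much less likely to wait on each other's locks.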

You can also use MERGE tables sitting on top of MyISAM tables that
each hold "per year" or "per whatever" data. A query for 2009 data
then only has to look in table xxx_2009 instead of searching the whole
data set, so this can be a big win. MyISAM has the inconvenience that
if the server ever crashes, recovery of these tables can be very
time-consuming. InnoDB, on the other hand, has a larger footprint for
the same data.
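
A minimal sketch of the per-year layout, again with made-up table and
column names (samples_2008, samples_2009, samples_all); adjust the
columns and keys to your own structure:

```sql
-- Hypothetical per-year MyISAM tables with identical structure.
CREATE TABLE samples_2008 (
    source_id  INT UNSIGNED NOT NULL,
    sampled_at DATETIME     NOT NULL,
    value      INT          NOT NULL,
    KEY (source_id, sampled_at)
) ENGINE=MyISAM;

CREATE TABLE samples_2009 LIKE samples_2008;

-- MERGE table over the yearly tables. The column and index
-- definitions must match the underlying tables; new rows inserted
-- through samples_all go to the last table in the UNION list.
CREATE TABLE samples_all (
    source_id  INT UNSIGNED NOT NULL,
    sampled_at DATETIME     NOT NULL,
    value      INT          NOT NULL,
    KEY (source_id, sampled_at)
) ENGINE=MERGE UNION=(samples_2008, samples_2009) INSERT_METHOD=LAST;

-- Queries for a single year can go straight to that year's table.
SELECT COUNT(*) FROM samples_2009 WHERE source_id = 42;

-- Queries spanning years use the MERGE table.
SELECT COUNT(*) FROM samples_all WHERE source_id = 42;

-- Retiring a year: take it out of the UNION list, then drop it,
-- instead of deleting millions of rows one by one.
ALTER TABLE samples_all UNION=(samples_2009);
DROP TABLE samples_2008;
```

Note that a MERGE table cannot enforce uniqueness across the
underlying tables, so this layout suits data that is naturally
partitioned by the "per whatever" value.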

So it's hard to answer your question without more information on the
structure and the use cases. In fact, if you have the time, try out
and benchmark different approaches and see which is best for your
requirements. Just remember that as the data grows, the initial
measurements may not be consistent with the behaviour you see later.
Also, with a large amount of data like this, appropriate server tuning
can influence performance significantly.
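
When comparing approaches, something along these lines may help you
see the footprint of each candidate table and the main buffer
settings. The 'samples%' pattern is only a placeholder for whatever
your test tables are called.

```sql
-- Size of each candidate table in the current database.
-- table_rows is only an estimate for InnoDB tables.
SELECT table_name, engine, table_rows,
       ROUND(data_length  / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name LIKE 'samples%';

-- Two of the settings that matter most for large tables:
-- the InnoDB buffer pool and the MyISAM key cache.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'key_buffer_size';
```

Repeat the measurements as the test data grows, since a table that
fits in memory behaves very differently from one that does not.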

Hope this helps.

Simon

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org
