> Date: Mon, 8 May 2006 17:24:17 -0400
> From: Alex Brelsfoard 
> To: [email protected]
> Subject: [Boston.pm] LARGE MySQL Database

> Basically I am dealing with using, storing, and sorting a LOT of data in a
> mysql database.
> With all the data in the table it makes for 404.8 Million rows.  In a backup
> sql file that makes just under 80GB.
>
> I am using the InnoDB engine.
>
> I was just wondering if anyone else has had experience working with
> databases this large, and using MySQL.  I've run into some smaller
> problems along the way due to the immensity of this table.  In the
> end, to do what we want I will be creating a smaller table, with a
> subset of entries from the original.  But the original needs to
> exist as well.
>
> I'm looking for heads up warning for things I should watch out for due to
> the size of this thing, or any suggestions on speedier sorting and
> querying.

I've dealt with large volumes of data with innodb and MySQL.  Not
quite as large as yours -- maybe 200 million rows with 12 GB dump
file.

If you haven't normalized the pants out of the table, that's probably
a good place to start.  Anything you can do to trim a few bytes off
the row size will make a big difference in the size of the overall
table.  (Is this table part of a data warehouse?).  Smaller data will
always be faster than bigger data.
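As an illustration of the kind of trimming I mean (the table and column names here are invented, not from the original post), pulling a repeated wide string column out into a small lookup table can shave tens of bytes off every row:

```sql
-- Hypothetical schema: replace a repeated VARCHAR(50) column on the
-- big table with a 2-byte integer key into a lookup table.
CREATE TABLE source_names (
    id   SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    UNIQUE KEY (name)
) ENGINE=InnoDB;

-- big_table.source_name VARCHAR(50) becomes big_table.source_id,
-- a SMALLINT UNSIGNED (2 bytes).  Saving even ~30-40 bytes per row
-- adds up to many gigabytes across 400 million rows.
ALTER TABLE big_table
    ADD COLUMN source_id SMALLINT UNSIGNED NOT NULL;
```

The ALTER on a table that size will take a long time, as noted below, so plan for the downtime.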

Of course, any structural change on a table that big won't be a quick
one.  For an internal system, that might not be a big deal.  For a
production system, it will probably be painful.

For general server tuning, here are a few things to try:

   - Make innodb_buffer_pool_size as big as you can, without causing
     the machine to swap.

   - if you're doing lots of writes (or your big table->table copy),
     set innodb_log_file_size to be a sizable percentage of
     innodb_buffer_pool_size; 30% or so.  That will help to speed up
     normal operations (innodb will have to spend less time flushing
     pages out to disk).  However, if you have an unclean shutdown,
     recovery will take longer.

     On my `big' innodb database, we're using two 200M innodb logs.
     That sped up bulk inserts a *lot*.

   - if some of the data is very rarely used, you might consider
     putting that portion of the data into archive tables.  The
     downside of archive tables is that they don't support indexing.
     But they're very compact (just a blob of zlib data, more or less).
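Put together, the settings above might look something like this in my.cnf.  The sizes are placeholders for illustration, not recommendations -- tune them to your machine's RAM:

```
# my.cnf fragment -- sizes are illustrative only
[mysqld]
# As large as possible without causing the box to swap; leave
# headroom for the OS and any other processes.
innodb_buffer_pool_size   = 2G

# Roughly 30% of the buffer pool, split across the log files.
# Bigger logs mean less time flushing pages during heavy writes,
# at the cost of longer crash recovery.
innodb_log_file_size      = 300M
innodb_log_files_in_group = 2
```

One caveat from MySQL of that era: changing innodb_log_file_size requires a clean shutdown and moving the old ib_logfile* files out of the way before restarting, or innodb will refuse to start.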
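A sketch of the archive-table idea (table and column names are made up for the example):

```sql
-- Hypothetical: move rows older than 2004 into a compressed
-- ARCHIVE table.  ARCHIVE supports only INSERT and SELECT --
-- no indexes, no UPDATE or DELETE on the archive itself.
CREATE TABLE big_table_pre2004 (
    id      INT UNSIGNED NOT NULL,
    created DATE NOT NULL,
    payload TEXT
) ENGINE=ARCHIVE;

INSERT INTO big_table_pre2004
    SELECT id, created, payload
    FROM big_table
    WHERE created < '2004-01-01';

-- The deletes run against the InnoDB table, which is fine.
DELETE FROM big_table WHERE created < '2004-01-01';
```

Queries against the archive will be full scans, so this only pays off for data you almost never touch.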

Finally, if you haven't run across these already:

   http://dev.mysql.com/doc/refman/4.1/en/innodb-tuning.html
   http://dev.mysql.com/doc/refman/4.1/en/innodb-configuration.html

Of course, some of this depends on what the data is, and what kind of
queries you're running against it.  Measuring performance before and
after is an important step in the process.
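For the measuring step, EXPLAIN is the first thing to reach for (query and names invented for the example):

```sql
-- Run before and after a schema or index change.  A type of ALL
-- in the output means a full table scan, which is what kills you
-- at 400 million rows; ideally you see range or ref on an index.
EXPLAIN SELECT COUNT(*)
FROM big_table
WHERE created BETWEEN '2005-01-01' AND '2005-02-01';
```

The slow query log (long_query_time in my.cnf) is also handy for catching the queries you didn't think to test.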

hth

Steve
 
_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm
