when reimporting or reinserting or whatever from a huge db i usually drop all the indexes reimport then create them again much quicker
-----Original Message----- From: Paul DuBois [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 09, 2003 1:09 PM To: Florian Weimer; [EMAIL PROTECTED] Subject: Re: Faster reindexing At 9:39 +0200 7/7/03, Florian Weimer wrote: >I've got a table with 100 million rows and need some indexes on it >(one row is 126 bytes). > >I'm currently using MyISAM and the indexing proceeds at an >astonishingly low rate: about 200 MB per hour. This is rate is far >too low; if we had to recover the database for some reason, we'd have >to wait for days. > >The table looks like this: > >CREATE TABLE flows ( > version CHAR NOT NULL, > router CHAR(15) NOT NULL, > src_ip CHAR(15) NOT NULL, > dst_ip CHAR(15) NOT NULL, > protocol TINYINT UNSIGNED NOT NULL, > src_port MEDIUMINT UNSIGNED NOT NULL, > dst_port MEDIUMINT UNSIGNED NOT NULL, > packets INTEGER UNSIGNED NOT NULL, > bytes INTEGER UNSIGNED NOT NULL, > src_if MEDIUMINT UNSIGNED NOT NULL, > dst_if MEDIUMINT UNSIGNED NOT NULL, > src_as MEDIUMINT UNSIGNED NOT NULL, > dst_as MEDIUMINT UNSIGNED NOT NULL, > src_net CHAR(1) NOT NULL, > dst_net CHAR(1) NOT NULL, > direction CHAR(1) NOT NULL, > class CHAR(1) NOT NULL, > start_time CHAR(24), > end_time CHAR(24) >); > >Indexes are created using this statement: > >mysql> ALTER TABLE flows > -> ADD INDEX dst_ip (dst_ip, src_ip), > -> ADD INDEX dst_port (dst_port, start_time), > -> ADD INDEX src_ip (src_ip, start_time), > -> ADD INDEX time (start_time); > >In theory, we could represent the columns router, src_ip, dst_ip, >start_time, end_time using integers of the appropriate size, but this >would make ad-hoc queries harder to type (and porting our applications >would be even more difficult). Perhaps, but as a test, you might add a couple of extra columns to the table, then populate them like this after loading the table: UPDATE flows SET int_src_ip = INET_ATON(src_ip), int_dst_ip = INET_ATON(dst_ip); Then try creating the indexes using int_src_ip and int_dst_ip rather than src_ip and dst_ip. If it's significantly faster, you may want to reconsider whether it might not be worth using INET_ATON(X) in your queries rather than X. > >Should I switch to another table type? It's easy enough to convert the table to, e.g., InnoDB and then create the indexes, so an empirical test should not be difficult. -- Paul DuBois, Senior Technical Writer Madison, Wisconsin, USA MySQL AB, www.mysql.com Are you MySQL certified? http://www.mysql.com/certification/ -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]