You could load the data into several smaller tables and combine them with a MERGE table, which would have no real effect on the schema as the application sees it - something like the sketch below.
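Just as a rough sketch (the table and column names here are invented, since I don't know your actual schema), the idea is a set of identically defined MyISAM tables fronted by a MERGE table:

CREATE TABLE text_part1 (
  id   INT UNSIGNED NOT NULL,
  body LONGBLOB
) ENGINE=MyISAM;

CREATE TABLE text_part2 LIKE text_part1;
-- ...as many parts as you need

-- The MERGE table presents the parts as one table to the application
CREATE TABLE text_all (
  id   INT UNSIGNED NOT NULL,
  body LONGBLOB
) ENGINE=MERGE UNION=(text_part1, text_part2) INSERT_METHOD=LAST;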
Ade
Simon Collins wrote:
I'm loading the data through the command below:

mysql -f -u root -p enwiki < enwiki.sql
The version is MySQL 5.0.51a-community
I've disabled the primary key, so there are no indexes. The machine has a dual-core CPU and 2 GB of memory.
The import fell over overnight with a "table full" error as it hit 1 TB (I think this may be a file system problem). As it's no longer importing, SHOW STATUS isn't going to provide any interesting info; however, I did notice that mysql was not consuming much CPU time beforehand, ~10%.
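One thing I still need to rule out is the MyISAM maximum data file size rather than the filesystem. A quick sketch of the check (I'm assuming a table name of text here, which isn't necessarily what it's really called):

-- Look at the Max_data_length column for the table's current size ceiling
SHOW TABLE STATUS LIKE 'text';

-- If the ceiling is ~1 TB, raising MAX_ROWS/AVG_ROW_LENGTH makes MyISAM use a
-- larger data pointer (though on a table this size the ALTER itself is slow)
ALTER TABLE text MAX_ROWS = 1000000000 AVG_ROW_LENGTH = 10000;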
I'd rather not split the data up into separate tables, as that would change the schema and I'm not in charge of the schema design - I'm just the DBA at the back end.
Cheers
Simon
mos wrote:
Simon,
As someone else mentioned, how are you loading the data? Can you post
the SQL?
You have an Id field - is that not the primary key? If it is, the slowdown could be from maintaining the index. In that case, set key_buffer_size in your my.cnf file to up to 30% of your available RAM and restart the server (example below). How much RAM do you have on your machine and how many CPUs do you have? What version of MySQL are you using? Also, can you post your SHOW STATUS output after it has started to slow down? How much CPU is being used after the import slows down?
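For example, on a machine with a few GB of RAM, something along these lines in my.cnf (the figure is only an illustration - tune it to whatever else the box is doing):

[mysqld]
# roughly 25-30% of available RAM, as suggested above
key_buffer_size = 512M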
Now, from what you've said, it looks like you are using this table as a lookup table, so if it just has an id and a blob field, you probably return the blob field for a given id, correct? If it were up to me, I would break the data into more manageable tables. If you have 100 million rows, then I'd break it into 10 tables of 10 million rows each: table_1 would have ids from 1 to 9,999,999, table_2 would have ids from 10,000,000 to 19,999,999, and so on. Your lookup would call a stored procedure which determines which table to use based on the Id it was given (rough sketch below). If you really had to search all the tables, you could then use a MERGE table based on those 10 tables. I use MERGE tables quite a bit and the performance is quite good.
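Roughly like this (just a sketch - the table, column, and procedure names are invented):

DELIMITER //
-- Returns the blob for a given id by picking the right 10-million-row table
CREATE PROCEDURE get_body(IN p_id INT UNSIGNED)
BEGIN
  -- ids 1-9,999,999 go to table_1, 10,000,000-19,999,999 to table_2, etc.
  SET @tbl = CONCAT('table_', FLOOR(p_id / 10000000) + 1);
  SET @sql = CONCAT('SELECT body FROM ', @tbl, ' WHERE id = ', p_id);
  PREPARE stmt FROM @sql;
  EXECUTE stmt;
  DEALLOCATE PREPARE stmt;
END //
DELIMITER ;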
Mike
At 11:42 AM 6/4/2008, you wrote:
Dear all,
I'm presently trying to import the full Wikipedia dump for one of our research users. Unsurprisingly, it's a massive import file (2.7 TB). Most of the data is going into a single MyISAM table which has an id field and a blob field. There are no constraints or indexes on this table. We're using an XFS filesystem.
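Roughly speaking, the table has this shape (the names here are simplified stand-ins, not the real ones):

CREATE TABLE page_text (
  id   INT UNSIGNED NOT NULL,  -- no constraints or indexes during the load
  body LONGBLOB                -- the page text itself
) ENGINE=MyISAM;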
The import starts off quickly but gets increasingly slower as it progresses: it began at about 60 GB per hour, but now that the MyISAM table is ~1 TB it has slowed to about 5 GB per hour. At this rate the import will not finish for a considerable time, if at all.
Can anyone suggest why this is happening and whether there's a way to improve performance? If there's a more suitable list to discuss this, please let me know.
Regards
Simon