I'm loading the data with the command below:

    mysql -f -u root -p enwiki < enwiki.sql

The version is MySQL 5.0.51a-community

I've disabled the primary key, so there are no indexes. The machine has two CPU cores and 2 GB of memory.
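For what it's worth, dropping the key before the load was along these lines (the table and column names here are only illustrative, not necessarily the actual dump schema):

    ALTER TABLE text DROP PRIMARY KEY;          -- nothing to maintain during the load
    -- ...run the import...
    ALTER TABLE text ADD PRIMARY KEY (old_id);  -- rebuild the index in one pass afterwards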

The import fell over overnight with a "table full" error as the table hit 1 TB (I think this may be a file system problem). As it's no longer importing, SHOW STATUS isn't going to provide any interesting info; however, I did notice that mysqld was not consuming much CPU time while it ran - around 10%.
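One thing I still need to rule out is MyISAM's own size cap rather than the file system: the data file is limited by the table's data pointer size, and a 1 TB ceiling is exactly what a 5-byte pointer gives you. If that turns out to be the cause, I gather the fix is something along these lines before reloading (table name and figures are only illustrative):

    SHOW TABLE STATUS LIKE 'text'\G   -- compare Max_data_length with the 1 TB the table reached

    -- MAX_ROWS * AVG_ROW_LENGTH is what MyISAM uses to size the data pointer,
    -- so pick values comfortably above the expected final table size.
    ALTER TABLE text MAX_ROWS=200000000 AVG_ROW_LENGTH=20000;

Running that ALTER on an already-huge table rewrites the whole data file, so it would be far cheaper to set those options (or myisam_data_pointer_size) up front before reloading.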

I'd rather not split the data up into separate tables, as that would change the schema and I'm not in charge of the schema design - I'm just the DBA at the back end.

Cheers

Simon

mos wrote:
Simon,
As someone else mentioned, how are you loading the data? Can you post the SQL?

You have an Id field, so is that not the primary key? If so, the slowdown could be down to maintaining the index. In that case, set key_buffer_size in your my.cnf file to as much as 30% of your available RAM and restart the server. How much RAM do you have on the machine, and how many CPUs? What version of MySQL are you using? Also, can you post your SHOW STATUS output after the import has started to slow down? How much CPU is being used once it slows down?
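As a rough illustration, on a box with, say, 2 GB of RAM that would mean something like the following in the [mysqld] section of my.cnf (the figures are only a starting point, not a rule):

    [mysqld]
    # roughly 30% of 2 GB for the MyISAM key cache
    key_buffer_size = 600M
    # a bigger bulk-insert buffer can also help large MyISAM loads
    bulk_insert_buffer_size = 256M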

Now, from what you've said, it looks like you are using this table as a lookup table, so if it just has an id and a blob field, you probably return the blob field for a given id, correct? If it were up to me, I would break the data into more manageable tables. If you have 100 million rows, I'd break it into ten tables of 10 million rows each: table_1 would hold ids 1 to 9,999,999, table_2 ids 10,000,000 to 19,999,999, and so on. Your lookup would call a stored procedure which determines which table to use based on the id it was given (see the sketch below). If you really had to search all the tables, you could then use a Merge table based on those 10 tables. I use Merge tables quite a bit and the performance is quite good.
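A rough sketch of what I mean - table and column names are made up for illustration, and I'm assuming an integer id plus a blob column:

    -- ten identically defined MyISAM tables, text_1 .. text_10
    CREATE TABLE text_1 (
      id   INT UNSIGNED NOT NULL,
      body LONGBLOB NOT NULL,
      KEY (id)
    ) ENGINE=MyISAM;
    -- ...text_2 through text_10 are identical...

    -- a Merge table over the lot, for when you really do need to search them all
    CREATE TABLE text_all (
      id   INT UNSIGNED NOT NULL,
      body LONGBLOB NOT NULL,
      KEY (id)
    ) ENGINE=MERGE UNION=(text_1, text_2, text_3, text_4, text_5,
                          text_6, text_7, text_8, text_9, text_10)
      INSERT_METHOD=LAST;

    -- the lookup picks the right underlying table from the id
    DELIMITER //
    CREATE PROCEDURE get_text(IN p_id INT UNSIGNED)
    BEGIN
      SET @tbl = CONCAT('text_', FLOOR(p_id / 10000000) + 1);
      SET @sql = CONCAT('SELECT body FROM ', @tbl, ' WHERE id = ', p_id);
      PREPARE stmt FROM @sql;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
    END//
    DELIMITER ;

The Merge table only works if the column and index definitions of the underlying tables match exactly, which is why they all use a plain (non-unique) KEY on id.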

Mike

At 11:42 AM 6/4/2008, you wrote:
Dear all,

I'm presently trying to import the full Wikipedia dump for one of our research users. Unsurprisingly, it's a massive import file (2.7 TB).

Most of the data is importing into a single MyISAM table which has an id field and a blob field. There are no constraints / indexes on this table. We're using an XFS filesystem.
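For context, the table is essentially of this shape - the names here are my paraphrase rather than the actual dump schema:

    CREATE TABLE enwiki_text (
      old_id   INT UNSIGNED NOT NULL,  -- the id field
      old_text LONGBLOB NOT NULL       -- the blob field holding the page text
    ) ENGINE=MyISAM;                   -- no indexes or constraints during the load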

The import starts off quickly but gets increasingly slower as it progresses: it began at about 60 GB per hour, but now that the MyISAM table is ~1 TB it has slowed to about 5 GB per hour. At this rate the import will not finish for a considerable time, if at all.

Can anyone suggest why this is happening and whether there's a way to improve performance? If there's a more suitable list for discussing this, please let me know.

Regards

Simon




--
Dr Simon Collins
Data Grid Consultant
National Grid Service
University of Manchester
Research Computing Services
Kilburn Building
Oxford Road
Manchester
M13 9PL

Tel 0161 275 0604

