RE: Non-linear degradation in bulk loads?

Jon Frisby Mon, 10 Oct 2005 21:20:45 -0700

Manoj,

Thanks for the reply!  Unfortunately, I failed to note that we don't
have an auto-extending data pool.  Our data pool is about 212GB, in 4GB
chunks, with about 4.5GB free right now.  We extend it explicitly, and
monitor to make sure it doesn't fill up.


-JF 

> -----Original Message-----
> From: Manoj [mailto:[EMAIL PROTECTED] 
> Sent: Monday, October 10, 2005 8:51 PM
> To: Jon Frisby
> Cc: mysql@lists.mysql.com
> Subject: Re: Non-linear degradation in bulk loads?
> 
> Not sure but given that you suffer from non-linear 
> degradation in performance;my guess is you might be extending 
> your ibdata file every too frequently during the batch load 
> process. Check the ibdata_data_file_path variable in my.cnf 
> for more details.
> 
> Cheers
> 
> Manoj
> On 10/11/05, Jon Frisby <[EMAIL PROTECTED]> wrote:
> > Everyone,
> >
> > We're trying to do some bulk data loads on several different tables 
> > (on several different machines, using several different techniques) 
> > and seeing dramatically worse-than-linear performance.
> >
> > We've tried the bulk-INSERT syntax, and the LOAD DATA INFILE syntax.
> > We've done ALTER TABLE ... DISABLE KEYS, SET FOREIGN_KEY_CHECKS=0 
> > (where appropriate), and so forth.
> >
> > The one that is the most immediate concern is a table of the form:
> >
> > CREATE TABLE `test` (
> >  `email` varchar(255) NOT NULL default '',  `when_happened` 
> datetime 
> > NOT NULL default '0000-00-00 00:00:00',  UNIQUE KEY `email` 
> (`email`),  
> > KEY `when_happened` (`when_happened`)
> > ) TYPE=InnoDB;
> >
> > I'm loading data using LOAD DATA INFILE with chunks containing 3.4m 
> > rows each (~135MB files).  The first chunk was very quick 
> (about 1.5 
> > minutes), but the tenth chunk has taken 22.6 hours and is 
> still going.
> > (It's been getting progessively slower with each chunk...)
> >
> > The database is our main sites database but we've 
> dramatically reduced 
> > the load on that machine over the past couple months 
> through careful 
> > optimization of our code.  The box is a dual, dual-core 
> Opteron, 8GB 
> > of RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 
> (32-bit of 
> > course).  We have 1GB allocated to the buffer pool, and our 
> usual 1GB 
> > *
> > 3 log files.  8 I/O threads.
> >
> > Load on the box sits at around 6-7, with a large (>50%) 
> amount of time 
> > spent in wait state, but actual disk throughput to our 
> software RAID 
> > array (No longer on a SAN...) is quite low -- 6-9k blocks/s 
> out, 1-6k 
> > blocks/s in.
> >
> > Something *has* to be wrong here, but we're not sure what 
> we've missed.
> > We've restored larger data sets from a mysqldump in the past in 
> > dramatically less time on far inferior hardware. (A 
> superset of this 
> > same data to a schema which is also a superset, PLUS a 
> bunch of other 
> > rather large tables -- all in ~8 hours on a 3Ware RAID 
> array on a dual 
> > Xeon w/ 4GB of RAM)
> >
> > We're inclined to believe that this is a configuration problem, as 
> > opposed to a driver or hardware problem given the 
> non-linear nature of 
> > the performance degradation.  This implies we're doing 
> something truly 
> > stupid with our loads.  What could cause this kind of strangeness?
> >
> > -JF
> >
> > --
> > MySQL General Mailing List
> > For list archives: http://lists.mysql.com/mysql
> > To unsubscribe:    
> http://lists.mysql.com/[EMAIL PROTECTED]
> >
> >
> 

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

RE: Non-linear degradation in bulk loads?

Reply via email to