Well, in my program that does this, I'm not doing any batching, because the data isn't reliable enough for me to guarantee that everything in the batch will load without some constraint violation. So every insert/update is a separate transaction. You can't reach this kind of performance by simply reading a file and loading it with straight JDBC or iBATIS.
The only way to accomplish this is to make your application multithreaded. So what I did was read from the file, store a certain number of rows into a Queue, and then spawn threads to work on that queue doing the inserts/updates. Both the number of items in each queue and the maximum number of threads working on a queue are defined in a properties file, so you do have to experiment to discover what those parameters should be to reach optimal performance. I was quite surprised at my results, because my code is also using Commons BeanUtils (PropertyUtils) to populate my bean objects for every row in the file.

Anyway, I hope this helps, and if you have more questions on how to do this then please let me know.

Thanks,
Brent

-----Original Message-----
From: Christopher Lamey [mailto:[EMAIL PROTECTED]
Sent: Friday, April 28, 2006 9:52 AM
To: [email protected]
Subject: Re: iBatis performance FYI

On Fri, 2006-04-28 at 10:07 -0600, Larry Meadors wrote:
> That is not too bad. For reference, I did a JDBC import of 180,674
> records in 186.620s, which is only 1032/second.
>
> Any tips learned along the way that you'd like to share?

Hello all,

I would be very interested in hearing anyone's experience here. I've got to cram millions of records into a postgres db as fast as possible. I don't have test machines set up yet, but on my laptop (3GHz P4) I'm seeing encouraging numbers: roughly 3.7ms per input file line, each of which breaks down into somewhere around 10 to 20 inserts and an update.

Ideally I'd use the psql COPY command, which looks to bypass the SQL parser and so is extremely fast (like Oracle's sqlldr or Sybase/SQL Server's bcp), but I need to do programmatic things with the data. I've dropped as many constraints (keys and indexes) as I can and tuned postgres for inserts. The postgres tuning involved raising the relevant settings so that lots of inserts don't cause lots of sync and I/O operations.
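[Editor's note: the queue-and-worker scheme Brent describes above might be sketched roughly like this. The class and method names, queue size, and thread count are illustrative, not from his program, and the actual per-row insert/update is stubbed out as a comment.]

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class QueueLoader {
    // In Brent's description these come from a properties file and
    // must be tuned per database and machine.
    static final int MAX_THREADS = 4;

    /** Drains the given rows with MAX_THREADS workers; returns rows processed. */
    public static int load(List<String> fileLines) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(fileLines);
        ExecutorService pool = Executors.newFixedThreadPool(MAX_THREADS);
        final AtomicInteger processed = new AtomicInteger();
        for (int i = 0; i < MAX_THREADS; i++) {
            pool.submit(() -> {
                String row;
                while ((row = queue.poll()) != null) {
                    // In the real program each row is its own transaction:
                    // insertOrUpdate(row);  // hypothetical DAO call
                    processed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return processed.get();
    }
}
```

Because every row is an independent transaction, a failed constraint check in one worker doesn't poison the rest of the load, which is the trade-off Brent makes for unreliable data.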
iBATIS-wise, I'm batching INSERT statements using a runtime parameter for the batch size. A batch size of 100 seems to be working well for me, but this is dependent on the database and machine configuration, so I'll tune it on more realistic hardware. Other than batches, are there iBATIS features I should be looking at for INSERT performance?

Cheers,
topher
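[Editor's note: the "flush every N rows" pattern topher describes can be sketched as below. With iBATIS the flush would be `executeBatch()` between `startBatch()` calls on the `SqlMapClient`; here the flush is a pluggable callback so the chunking logic can be shown without a database. The `Batcher` class and its names are illustrative.]

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Accumulates rows and flushes them in fixed-size batches. */
public class Batcher<T> {
    private final int batchSize;           // the runtime parameter (e.g. 100)
    private final Consumer<List<T>> flush; // e.g. sqlMap.executeBatch() in iBATIS
    private final List<T> pending = new ArrayList<>();

    public Batcher(int batchSize, Consumer<List<T>> flush) {
        this.batchSize = batchSize;
        this.flush = flush;
    }

    public void add(T row) {
        pending.add(row);
        if (pending.size() == batchSize) flushNow();
    }

    /** Flushes any trailing partial batch; call once after the last add(). */
    public void finish() {
        if (!pending.isEmpty()) flushNow();
    }

    private void flushNow() {
        flush.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

Feeding 250 rows through a size-100 `Batcher` produces flushes of 100, 100, and 50 rows, matching the behavior you'd want from the iBATIS batch boundary.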
