Mike wrote:
As you can see in the results, Derby currently does not optimize
batch processing in embedded mode, so if you are embedded I would use code
similar to what is included below.  The biggest performance win is
to insert as many rows as possible between commits.  Each commit
does a synchronous I/O and waits for the data to hit disk, so on modern
processors you will quickly become I/O-wait bound unless you make
commits big enough. "Big enough" depends on the data; for instance,
below it looks like the optimal number may be bigger than 100000,
but the difference between 10k and 100k is not much (6250 rows/sec for
10k vs. 6924 rows/sec for 100k).
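
For the archives, here is a minimal sketch of that pattern in plain JDBC
against embedded Derby: autocommit off, executeUpdate per row, and a commit
every N rows. The database name, table definition, row count, and commit
interval below are illustrative assumptions, not taken from Mike's code.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BatchInsert {
    public static void main(String[] args) throws SQLException {
        // Embedded Derby URL; ";create=true" creates the database if absent.
        try (Connection conn =
                 DriverManager.getConnection("jdbc:derby:mydb;create=true")) {
            conn.setAutoCommit(false); // one commit per batch, not per row
            try (PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO MYTABLE (ID, NAME) VALUES (?, ?)")) {
                final int rowsPerCommit = 10000; // tune: bigger commits
                                                 // amortize the sync I/O
                for (int i = 0; i < 1000000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "row-" + i);
                    ps.executeUpdate();
                    if ((i + 1) % rowsPerCommit == 0) {
                        conn.commit(); // synchronous disk write happens here
                    }
                }
                conn.commit(); // flush any remaining rows
            }
        }
    }
}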

There is a benefit to doing it all in a single commit anyway: if the process crashes, someone kills it, or the machine loses power, the database is left in a state as if nothing had occurred.

I need to rerun these tests with a networked database, but setting that up isn't as trivial as a unit test. :-)

Also, the benefits of doing it in networked mode probably vary greatly depending on the network itself.

Daniel Noll

Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia    Ph: +61 2 9280 0699
Web: http://nuix.com/                               Fax: +61 2 9212 6902
