Re: public int executeBatch()

Christopher Lamey Fri, 02 Jun 2006 10:34:39 -0700

On Thu, 2006-06-01 at 20:45 -0400, rs wrote:
> 
>  --- On Thu 06/01, Christopher Lamey < [EMAIL PROTECTED] > wrote:
> > FWIW, BatchUpdateException is thrown if there's an exception during a
> > batch update and has enough info for me to figure out what broke when.
> 
> I'd be interested to find out more how it works out for you.
> 
> Say, a batch has 10K records. 300 of them may fail, for various reasons. How 
> do you figure out *why* those records failed? Do you run your batch 300 
> times, fixing one record at a time? Say, you need to give feedback to a 3rd 
> party about failed records and the reasons, how do you do that?


In my current project, because we support different databases that may
or may not have drivers that continue after hitting an error (most do
not), my first approach is to go through each one as they come up.

Since that's a slow process and not helpful for large 3rd party data
sets, there are some validation steps I might do before trying the
inserts.  The validation can be parsing the input file directly and
looking for problems or loading the data first into a flat staging
table.  In some cases I've used the RDBMS' native bulk loaders (sqlldr,
psql 'COPY', bc.exe, etc) to get large data sets (30+ million rows) into
a staging table as fast as possible.  Then I look for problem data in
the staging table data and go from there.

> Also, when one of the records fails, the whole batch will be rolled back. 
> Suppose, some records cannot be fixed but you still want "good" records to be 
> imported. How do you deal with that?

To be honest, since this behavior is driver and RDBMS dependent, I don't
expect or use it.  The update counts are in the same boat, what they
mean changes between vendors and even on the type of sql executed.

Oracle says they halt on the first error, set all the result counts to
-3, and a rollback is required when using prepared statements in a
batch.  Generic statements can be commit()'d up to to the one that
caused the problem and the counts are correct, but generic statements in
a batch are just executed as regular statements:

http://www.lc.leidenuniv.nl/awcourse/oracle/java.920/a96654/oraperf.htm#1059011

Postgres bails on the first exception in a batch as well, and also has
funky update count issues.  Last time I used SQL Server it did the same
thing.

It looks like MySQL can be configured (continueBatchOnError=true) to
continue after the first error and only throw one at the end.  It's not
clear how you would get a list of all the exceptions that happened at
that point and how you'd correlate that to the update counts.  My guess
would be you'd have to loop through the getNextException() and keep a
count?

Maybe for now with iBATIS you could loop through the exceptions using
getNextException() and hope the messages are descriptive enough?

Cheers,
Chris

Re: public int executeBatch()

Reply via email to