On Wed, Jan 19, 2011 at 10:25:05PM +0000, Martin J. Evans wrote:
> On 19/01/2011 11:47, Tim Bunce wrote:
> >
> > - I think it's reasonable for execute_array() and execute_for_fetch()
> >   to return an error (ie err() true and so trigger RaiseError etc.)
> >   if execution of any of the tuples encountered an error.
Umm. I take that back. I suggest we change the spec from:

    When called in scalar context the execute_array() method returns the
    number of tuples executed, or undef if an error occurred. Like
    execute(), a successful execute_array() always returns true regardless
    of the number of tuples executed, even if it's zero. If there were any
    errors the ArrayTupleStatus array can be used to discover which tuples
    failed and with what errors.

to either (new or changed text in capitals):

a:

    When called in scalar context the execute_array() method returns the
    number of tuples executed INCLUDING ANY TUPLES WHICH FAILED, or undef
    if an error occurred THAT PREVENTED NORMAL OPERATION OF THE METHOD.
    THE FAILURE OF SOME OR ALL TUPLES IS NOT REGARDED AS AN ERROR. Like
    execute(), a successful execute_array() always returns true regardless
    of the number of tuples executed, even if it's zero. If there were any
    errors the ArrayTupleStatus array can be used to discover which tuples
    failed and with what errors.

or b:

    When called in scalar context the execute_array() method returns the
    number of tuples executed IF ALL ARE SUCCESSFUL. IF ANY TUPLES
    ENCOUNTER AN ERROR THEN execute_array() RETURNS UNDEF. Like execute(),
    a successful execute_array() always returns true regardless of the
    number of tuples executed, even if it's zero. If there were any errors
    the ArrayTupleStatus array can be used to discover which tuples failed
    and with what errors.

I prefer (a), but (b) is how the DBI's default execute_array() and
execute_for_fetch() behave, which is obviously significant.

> The way I see it is that if some of the rows in the batch end up in
> the table and some don't I'd expect a warning. However, if 1 or more
> rows fail and no rows end up in the table I don't see any success so
> I'd expect an error. In my example code, 1 row fails but no rows end
> up successful so it is the latter.

Shouldn't the number of rows "failing" and the number not ending up in
the database be the same?
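For what it's worth, under proposal (a) an application would lean on
ArrayTupleStatus rather than the return value to detect per-tuple
failures. A minimal sketch of how caller code would see the two
proposals (the connection details and `mytable` are hypothetical):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Hypothetical DSN/credentials - adjust for your driver and database.
my $dbh = DBI->connect('dbi:Oracle:orcl', 'user', 'pass',
                       { RaiseError => 0, PrintError => 1 });

my $sth = $dbh->prepare('INSERT INTO mytable (id, name) VALUES (?, ?)');

my @tuple_status;
my $tuples = $sth->execute_array(
    { ArrayTupleStatus => \@tuple_status },
    [ 1, 2, 2, 3 ],                 # ids (2 is duplicated on purpose)
    [ 'a', 'b', 'b2', 'c' ],
);

if (defined $tuples) {
    # Proposal (a): $tuples counts every tuple attempted, so inspect
    # ArrayTupleStatus to find the per-tuple failures.
    for my $i (0 .. $#tuple_status) {
        my $status = $tuple_status[$i];
        # A failed tuple's status is an arrayref [ err, errstr, state ].
        warn "tuple $i failed: $status->[1]\n" if ref $status;
    }
}
else {
    # Proposal (b), the current DBI default: undef means at least one
    # tuple failed - ArrayTupleStatus still shows which ones.
    warn "execute_array failed: ", $dbh->errstr, "\n";
}
```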
Anything else seems like a bug. Or are you referring to a batch being
aborted early?

Regarding the batch being aborted early, how about this new spec?:

    Some databases may abort the batch early after one or more errors,
    in which case the number of elements in ArrayTupleStatus would be
    less than expected.

(That's assuming we go with (b) above. If (a) then execute_array()
would return the reduced tuple count.)

> It may be a change/bug in the Oracle libraries as I thought when you
> execute a batch all the rows are executed no matter if some fail and
> all successful ones end up in the table.

I wonder if we could run an older DBD::Oracle against an Oracle 9 (say)
database to recreate the (presumably) original behaviour.

> > - That'll need a spec clarification and a clear warning in Changes.
> >
> > - We should avoid breaking existing batch apps that use DBD::Oracle.
> > - I'm hopeful that the above change wouldn't. (John?)
> >
> So am I. If we could get the examples John has seen via his
> DBD::Oracle maintainership or via Pythian customers I would be happy
> to create test cases. We are running blind at the moment as we've
> not got those solid examples of supposedly working code.

Agreed. John, can you get some working examples so we can see what error
handling is used and what the impact of any changes above would be?

> > - We should review other database APIs that provide batch execution
> >   in order to spec a reasonable common subset behaviour for the DBI.
> >
> Obviously ODBC and JDBC do batched statements. I might provide a
> JDBC example but for now I've done an ODBC example (slightly more
> familiar to me) - see below.

ODBC is handy, thanks. I'd also like to see how more native database
APIs handle these issues. Anyone know what other native database APIs
support batches?

> > - Clearly we should bring DBD::Oracle, the DBI default behaviour, and
> >   the DBI spec into agreement with each other.
> >
> Exactly. This is really my main point.
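With that wording, a caller wanting to defend against an early-aborted
batch could compare the status count against the number of tuples it
submitted. A sketch (following on from an execute_array() call with
ArrayTupleStatus => \@tuple_status; the count of 4 is illustrative):

```perl
my $expected = 4;    # number of tuples we submitted
my $reported = scalar @tuple_status;

if ($reported < $expected) {
    # The driver/database aborted the batch early: tuples beyond
    # $reported were never attempted, as opposed to merely failing.
    warn sprintf "batch aborted after %d of %d tuples\n",
        $reported, $expected;
}
```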
> As it stands (and given it is not a bug in Oracle) I see a nightmare
> for anyone trying to use execute_array in a database-neutral way:
> with DBI, all successful rows are inserted and we know which ones
> failed, while with DBD::Oracle no rows are inserted (including the
> ones where there is no error), you cannot commit the good ones, and
> it is difficult (if not impossible) to know what really happened.
> This is not a dig at anyone in particular as I added the array
> context execute_array to DBI/DBD::Oracle, but this is also why I
> suspect something has changed in DBD::Oracle/Oracle.

Not being able to tell what happened is clearly a bug in DBD::Oracle.

> > - We *really* need a way to share tests across drivers.
> >   Perhaps something like a separate DBI::TestSuite distro that the DBI
> >   and drivers could have as a prerequisite. That would contain tests in
> >   modules. The DBI and DBDs would have a test file that uses the
> >   DBI::TestSuite module and calls a function that runs the tests.
> >   This issue could provide the first test.
> >
> > Tim.
>
> I agree and I seem to remember a project to do something like this -
> was it perhaps a google summer of code suggestion? But it is pretty
> difficult and I think that puts a lot of people off. I briefly
> looked at Test::Database so I could get more realistic test results
> for DBD::ODBC but I ran into a load of problems as Test::Database
> needs some DBD methods writing and expects to be able to create a
> database, and in ODBC (via dozens of ODBC drivers) there is no
> single way to do this. The gain was just not worth the pain for me.
> I'd be happy to help someone do this but only in a minor way as
> right now I cannot find the time to satisfy even half of my OS
> commitments (as an example, I REALLY want to be able to set handle
> attributes on methods in DBI [post from a week back] but I just
> cannot find time to do it - something else is always cropping up).
Having Test::Database know how to work with only a very small set of
common ODBC drivers would still be a big help.

> It has 4 tests:
>
> 1. insert a batch successfully reading the parameter status array
>    (ArrayTupleStatus) to see what worked (autocommit)
> 2. insert a batch where 2 rows cannot be inserted because of a
>    duplicate key and with a parameter status array (autocommit)
> 3. insert a batch where 2 rows cannot be inserted because of a
>    duplicate key and without a parameter status array (autocommit)
> 4. insert a batch where 2 rows cannot be inserted with an explicit
>    txn and with a parameter status array
>
> What it tells us is:
>
> o SQL_SUCCESS is always returned when all the rows are inserted
> o SQL_SUCCESS_WITH_INFO is returned if some of the rows were
>   successful but some were not

Which corresponds with (a) above, but doesn't match existing DBI
default behaviour (b).

> o it does not matter whether we provide a parameter status array
>   (ArrayTupleStatus) or not - the 2 results above stand, i.e. even if
>   the ODBC driver cannot tell you which ones failed (because you did
>   not give a parameter status array) it still does the successful rows
>   and only returns SQL_SUCCESS_WITH_INFO if some failed.
> o AutoCommit makes no difference - i.e., if auto commit is on or off
>   the end result is the same IF we commit afterwards.

Tim.
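To make the comparison concrete, here's a pure-Perl sketch (the helper
name and the mapping to ODBC result codes are my own invention, not
anything in the DBI or ODBC specs) of how an ArrayTupleStatus-style
array would map onto the ODBC-like outcomes under proposal (a):

```perl
use strict;
use warnings;

# Hypothetical helper: given an ArrayTupleStatus-style list, report what
# a proposal-(a) execute_array() would return and the nearest ODBC-style
# outcome. Failed tuples are arrayrefs of [ err, errstr, state ].
sub summarize_batch {
    my (@tuple_status) = @_;
    my $failed = grep { ref $_ } @tuple_status;
    my $total  = scalar @tuple_status;
    return {
        tuples  => $total,    # (a): count includes the failed tuples
        failed  => $failed,
        outcome => $failed == 0      ? 'SQL_SUCCESS'
                 : $failed < $total  ? 'SQL_SUCCESS_WITH_INFO'
                 :                     'SQL_ERROR',
    };
}

# Example: 4 tuples, of which tuple 2 failed with a duplicate-key error.
my $r = summarize_batch(1, 1, [ 1, 'ORA-00001: unique constraint', '23000' ], 1);
print "$r->{tuples} tuples, $r->{failed} failed, $r->{outcome}\n";
# prints "4 tuples, 1 failed, SQL_SUCCESS_WITH_INFO"
```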