Multiple result sets and the $sth->more_results method (Mark 2)

Tim Bunce Sat, 01 May 2004 10:55:20 -0700

Here's an updated version of what I'm thinking of with respect
to $sth->more_results (many thanks for the valuable feedback).


This is quite long because I've tried to consider all the issues.


=head1 Multiple result sets and the $sth->more_results method

There are 3 kinds of batches of SQL statements:

 o Arrays of parameters - where an array of values is bound to
   placeholders in a single statement.
   [This kind of batch is supported via the bind_param_array() and
   execute_array() methods and won't be discussed further here.]

 o Explicit Batches - where the $sql passed to prepare() contains
   multiple statements separated by semicolons.

 o Procedures - where a procedure that contains multiple statements
   is executed AND the database is one that returns the results
   from those embedded statements (e.g. Sybase, MS SQL).

In the text below I'll refer to the individual statements within
a batch as sub-statements.

When a batch is first executed the statement handle appears exactly
as if the batch only contained the first sub-statement. (The only
exception is that the $sth->{NUM_OF_PARAMS} attribute equals the
total number of placeholders in the entire statement.)

The $sth->more_results method is used to move to the results of the
next sub-statement in the batch. If more_results() returns true then
the statement handle appears exactly as if the batch only contained
the next sub-statement. And so on. The $sth->more_results() method
I<appears> to have the effect of preparing and executing the next
sub-statement within the batch.

The more_results() method returns false in these situations:
If there is another sub-statement and that sub-statement fails
then
    The error is recorded (which will trigger RaiseError if enabled).
    more_results() returns 0;
else
    # No more result sets. Either all succeeded or an error has caused
    # execution of later statement to be skipped
    more_results() returns undef; 
end

Because the statement handle always appears as if only one statement
has been executed it follows that:

  $sth->err can be used to tell if there was an error, and
  $sth->{NUM_OF_FIELDS} > 0 if the sub-statement is a SELECT, and
  $sth->rows can be used to get the row count if it isn't
             (but see SQL_BATCH_ROW_COUNT below)

Whether the entire batch of statements is executed before any results
are available, or whether calling more_results() actually triggers
the execution of the next sub-statement in the batch, is undefined.

The state of statement handle attributes after more_results() returns
undef is undefined.

Placeholders work as if the entire statement string was a single statement.
If execute() is called with parameter values then the number of values must
match the total number of placeholders in the entire statement string.

The $h->{BatchOnlySelect} attribute can be used to skip results
from non-SELECT sub-statements.


=head2 Driver Metadata via get_info()

Three get_info() types are relevant here: SQL_MULT_RESULT_SETS,
SQL_BATCH_SUPPORT and SQL_BATCH_ROW_COUNT. I'll use the symbolic
names available via the DBI::Const::GetInfoType module:

  use DBI::Const::GetInfoType;

=head3 SQL_MULT_RESULT_SETS

  my $multresults_support = $dbh->get_info($GetInfoType{SQL_MULT_RESULT_SETS});

returns 'Y' is the driver supports multiple result sets and 'N' otherwise.
The likes of MS SQL Server, Sybase and DB2 return 'Y'.

=head3 SQL_BATCH_SUPPORT

  my $batch_support = $dbh->get_info($GetInfoType{SQL_BATCH_SUPPORT});

This returns a bitmask indicating the drivers support for batches:

 o SQL_BS_SELECT_EXPLICIT:    The driver supports Explicit Batches that can
        have result-set generating statements.

 o SQL_BS_ROW_COUNT_EXPLICIT  The driver supports Explicit Batches that can
        have row-count generating statements.

 o SQL_BS_SELECT_PROC          As above but for Procedures

 o SQL_BS_ROW_COUNT_PROC       As above but for Procedures

The likes of MS SQL Server sets all bits, DB2 sets SELECT_EXPLICT,
ROW_COUNT_EXPLICIT and SELECT_PROC but not COUNT_PROC.


=head3 SQL_BATCH_ROW_COUNT

  my $batch_row_count = $dbh->get_info($GetInfoType{SQL_BATCH_ROW_COUNT});

This returns a bitmask indicating how row counts from consecutive
INSERT, DELETE, or UPDATE statements within the batch are made
available:

 o SQL_BRC_ROLLED_UP
        Row counts for consecutive INSERT, DELETE, or UPDATE
        statements are rolled up into one. If this bit is not set,
        then row counts are available for each individual statement.

 o SQL_BRC_PROCEDURES
        Row counts, if any, are available when a batch is executed
        in a stored procedure. If row counts are available, they
        can be rolled up or individually available, depending on
        the SQL_BRC_ROLLED_UP bit.

 o SQL_BRC_EXPLICIT
        Row counts, if any, are available when a batch is executed
        directly by calling SQLExecute or SQLExecDirect. If row
        counts are available, they can be rolled up or individually
        available, depending on the SQL_BRC_ROLLED_UP bit.


=head2 Example Implementation via DBI Subclass

Another way to think about the behaviour of more_results() is to
imagine a DBI subclass that did something like this:

sub prepare {
    my ($dbh, $Statement) = @_;
    my @statements = split /;/, $Statement;            # split statements
    my $sth = $dbh->SUPER::prepare(shift @statements); # prepare the first
    $sth->{statements} = [EMAIL PROTECTED]                  # store the rest
        if @statements;
    return $sth;
}

sub execute {
    my $sth = shift;
    if ($sth->{statements}) {         # this is a batch
        $sth->{parameters} = [ @_ ];  # store the parameters
        return $sth->_more_results_execute();
    }
    return $sth->SUPER::execute(@_);
}

sub more_results {
    my ($sth) = @_;
    $sth->_more_results_prepare() or return;
    return $sth->_more_results_execute();
}

sub _more_results_prepare {
    my ($sth) = @_;
    my $statements = $sth->{statements}
        or return undef; # doesn't have multiple statements
    next_results:
    my $Statement = shift @$statements
        or return undef; # no more statements left
    $sth->prepare($Statement)
        or return 0;    # prepare failure
    return $sth;
}

sub _more_results_execute {
    my ($sth) = @_;
    execute_next:
    my @params = splice @{ $sth->{parameters} }, 0, $sth->{NUM_OF_PARAMS};
    my $rv = $sth->execute(@params)
        or return 0;    # execute failure
    if ($sth->{BatchOnlySelect} && $sth->{NUM_OF_FIELDS} == 0) {
        # skip this non-select sub-statement
        $sth->_more_results_prepare or return;
        goto execute_next;
    }
    return $rv; # must be true but is otherwise undefined
}

sub finish {
    my ($sth) = @_;
    $sth->SUPER::finish; # discard any unfetched from current sub-statement
    $sth->{statements} = undef; # discard any unexecuted sub-statements
    $sth->{parameters} = undef; # discard any unused parameters
    return 1;
}

Note that SELECT statements which return no rows are never skipped.
Only non-SELECT statements are skipped by $sth->{BatchOnlySelect}.
We can discuss the name and default setting of that attribute later (not now, please).

If a batch of statements update the values of placeholders which
are bound using bind_param_inout() the output values are only
guaranteed to be available after all results have been processed
(ie when more_results() returns undef).  Though some drivers may be
able to update the values sooner, or update them more than once.

[In practice a driver may have to call the 'more_results' function
of the underlying database API at the end of each result set
in order to correctly implement the desired behaviour.]


=head2 Application Examples

The generic loop to fetch multiple result sets is like this:

  $sth->execute;
  do {
      if ($sth->err) {
          ...handle sub-statement error... (if RaiseError not set)
      }
      elsif ($sth->{NUM_OF_FIELDS}) {
          while (@data = $sth->fetchrow_array) {
              ...
          }
      }
      else {
          printf "Non-SELECT: %d rows\n", $sth->rows;
      }
  } while (defined $sth->more_results);

If the application is happy to exit the loop if any sub-statement
trigers an error then the first C<if> and the C<defined> can be removed.
Then the code reduces to:

  $sth->execute or ...;
  do {
      if ($sth->{NUM_OF_FIELDS}) {
          while (@data = $sth->fetchrow_array) {
              ...
          }
      }
      else {
          printf "Non-SELECT: %d rows\n", $sth->rows;
      }
  } while ($sth->more_results);

and if $sth->{BatchOnlySelect} is set then that simplifies down to:

  $sth->execute or ...;
  do {
      while (@data = $sth->fetchrow_array) {
          ...
      }
  } while ($sth->more_results};
 
or even, for the do-loop-phobic, just:

  $sth->execute or ...;
  while ($data = $sth->fetchrow_arrayref
     or $sth->more_results && $data = $sth->fetchrow_arrayref
  ) {
      ...
  }
 
=head2 Related Changes

The selectall_* methods could perhaps be extended to suport an
optional attribute which would make them more_results() and loop.
Naturally that would only be used by applications in cases where
it was useful.


=head2 Issues

Active: the principles above mean that $sth->{Active} goes false
at the end of each set of results. That goes against the definition
of what Active means but I think that's a small issue compared with
the greater consistency elsewhere. A new attribute could be added
to fill the gap if need be.

finish: $sth->finish will discard all pending result sets.
($sth->more_results automatically discards any unfetched
rows from the current sub-statement.)

row errors: row fetch errors (where a row has an error, like a
truncation, but more rows can still be fetched) could be distinguished
previously by the fetch*() method returning false but Active still
being true.  That will still be true.

err: Obviously some errors will be detected and reported by the
original prepare() and execute() but others will not be detected
till later.  As far as possible more_results() is the place to
report any errors that would have been detected and reported by a
corresponding prepare() and execute() of the individual sub-statement.
(Even if the driver has to call it's own databases' more_results()
function at the end of the previous result set, any error should
be stored internally and not reported till the $sth->more_results()
method is called.)

BatchOnlySelect: How to behave if $sth->{BatchOnlySelect}
is true and a non-select statement generates an error?
We could:
a) ignore the error.
b) report the error as if BatchOnlySelect wasn't set.
c) report the error unless $sth->{BatchOnlySelect} >= 2 (my preference)

BatchOnlySelect must work even if a non-select is the first
sub-statement, i.e., the initial execute() must also ignore an
initial non-select.

So BatchOnlySelect must only apply to batches (hence prepare() or
execute() must be aware that they're working with a batch, either
and explicit one or a stored procedure).

If BatchOnlySelect is set and the batch only contains non-select
statements then the initial execute() would return false and not "0E0".


Comments?

Tim.

p.s. Many thanks to Michael Peppler, Jeff Urlwin and Martin Evans for their
valuable research and assistance.

Multiple result sets and the $sth->more_results method (Mark 2)

Reply via email to