Fellow DBI Users, Tim Bunce has asked me to help revise the NULLs subsection in the "Placeholders and Bind Values" section in the DBI documention for DBI 1.47. Attached is the proposed text to be included, along with an associated perl script. You are invited to review them. Please send comments back directly to me (not the DBI users mailing list). I'll post a summary in January. The revision addresses the issue of testing for NULL values in a WHERE clause. There's no "one size fits all" solution, so several options are given. In particular, several examples of WHERE clauses are presented that can support both NULL and non-NULL tests. But the examples tend to rely on database specific features or how flexibly they support placeholders (e.g. Informix supports the NVL function but not with a placeholder).
Near the bottom of the attached text is a table that lists some database engines and which WHERE clause examples work with them. The table is sparse. Only Informix has been properly tested so far. Any other entries you see are educated guesses. You can help us fill out the table. How? Use the attached perl script that you can run to see which examples actually work on your favorite database engine. I invite you to run it and send the results back to me (not the DBI users mailing list). I'll post a summary. View the top of the perl script for specific instructions. You'll likely want to change the suffix of the script's file name to .pl.
Brian Campbell (E-mail).vcf
Description: Binary data
B<NULL Values>
Undefined values, or C<undef>, are used to indicate NULL values.
You can insert update columns with a NULL value as you would a
non-NULL value. These examples insert and update the column
C<age> with a NULL value:
$sth = $dbh->prepare(qq{
INSERT INTO people (name, age) VALUES (?, ?)
});
$sth->execute("Joe Bloggs", undef);
$sth = $dbh->prepare(qq{
UPDATE people SET age = ? WHERE name = ?
});
$sth->execute(undef, "Joe Bloggs");
However, care must be taken in the particular case of trying to use
NULL values to qualify a C<WHERE> clause. Consider:
SELECT name FROM people WHERE age = ?
Binding an C<undef> (NULL) to the placeholder will I<not> select rows
which have a NULL C<age>! At least for database engines that
conform to the SQL standard. Refer to the SQL manual for your database
engine or any SQL book for the reasons for this. To explicitly select
NULLs you have to say "C<WHERE age IS NULL>".
A common issue is to have a code fragment handle a value that could be
either C<defined> or C<undef> (non-NULL or NULL) at runtime.
A simple technique is to prepare the appropriate statement as needed,
and substitute the placeholder for non-NULL cases:
$sql_clause = defined $age? "age = ?" : "age IS NULL";
$sth = $dbh->prepare(qq{
SELECT name FROM people WHERE $sql_clause
});
$sth->execute(defined $age ? $age : ());
The following technique illustrates qualifying a C<WHERE> clause with
several columns, whose associated values (C<defined> or C<undef>) are
in a hash %h:
for my $col ("age", "phone", "email")
{
if (defined $h{$col})
{push @sql_qual, "$col = ?"; push @sql_bind, $h{$col};}
else
{push @sql_qual, "$col IS NULL";}
}
$sql_clause = join(" AND ", @sql_qual);
$sth = $dbh->prepare(qq{
SELECT name FROM people WHERE $sql_clause
});
$sth->execute(@sql_bind);
The techniques above prepare the SQL statement with each call to
execute. Because calls to prepare can be expensive, performance
can suffer when an application iterates many times over statements
like the above.
A better solution is a single C<WHERE> clause that supports both
NULL and non-NULL comparisons. Its SQL statement would need to be
prepared only once for all cases, thus improving performance.
Several examples of C<WHERE> clauses that support this are presented
below. But each example lacks portability, robustness, or simplicity.
Whether an example is supported on your database engine depends on
what SQL extensions it provides, and where it supports the C<?>
placeholder in a statement.
0) age = ?
1) NVL(age, xx) = NVL(?, xx)
2) ISNULL(age, xx) = ISNULL(?, xx)
3) DECODE(age, ?, 1, 0) = 1
4) age = ? OR (age IS NULL AND ? IS NULL)
5) age = ? OR (age IS NULL AND SP_ISNULL(?) = 1)
6) age = ? OR (age IS NULL AND ? = 1)
Statements formed with the above C<WHERE> clauses require execute
statements as follows. The arguments are required, whether their
values are C<defined> or C<undef>.
0-3) $sth->execute($age);
4,5) $sth->execute($age, $age);
6) $sth->execute($age, defined($age)?0:1);
Example 0 should not work (as mentioned earlier), but may work on
a few database engines anyway (e.g. Sybase).
Examples 1 and 2 are not robust: they require that you provide a
valid column value xx (e.g. '~') which is not present in any row.
That means you must have some notion of what data won't be stored
in the column, and expect clients to adhere to that.
Example 5 requires that you provide a stored procedure (SP_ISNULL
in this example) that acts as a function: it checks whether a value
is null, and returns 1 if it is, or 0 if not.
Example 6, the least simple, is probably the most portable, i.e., it
should work with with most, if not all, database engines.
Here is a table that indicates which examples above are known to
work on various database engines:
-----Examples------
0 1 2 3 4 5 6
- - - - - - -
Oracle 9 N Y
Informix IDS 9 N N N Y N Y Y
MS SQL N Y Y Y
DB2
Sybase Y N
MySQL 4
DBI provides a sample perl script that will test the examples above
on your database engine and tell you which ones work. It is located
in the F<ex/> subdirectory of the DBI source distribution, or here:
L<http://svn.perl.org/modules/dbi/trunk/ex/perl_dbi_nulls_test.pl>
Please use the script to help us fill-in and maintain this table.
#! /usr/bin/perl -w
# This script checks which style of WHERE clause(s) will support both
# null and non-null values. Refer to the NULL Values sub-section
# of the "Placeholders and Bind Values" section in the DBI
# documention for more information on this issue. The clause styles
# and their numbering (0-6) map directly to the examples in the
# documentation.
#
# To use this script, update the connect method arguments to support
# your database engine and database. Set PrintError to 1 if you want
# see the reason WHY your engine won't support a particular style.
#
# To properly test style #5, you need the capability to create the
# stored procedure SP_ISNULL that acts as a function: it tests its
# argument and returns 1 if it is null, 0 otherwise. For example,
# using Informix, a definition would look like:
#
# CREATE PROCEDURE SP_ISNULL (arg VARCHAR(32)) RETURNING INTEGER;
# IF arg IS NULL THEN RETURN 1;
# ELSE RETURN 0;
# END IF;
# END PROCEDURE;
#
# Warning: This script will attempt to create a table named by the
# $tablename variable (default dbi__null_test_tmp) and WILL DESTROY
# any pre-existing table so named.
use strict;
use DBI;
my $tablename = "dbi__null_test_tmp"; # DESTROYs this table. Must be obscure
die "DBI_DSN environment variable not defined"
unless $ENV{DBI_DSN};
my $dbh = DBI->connect(undef, undef, undef, {
RaiseError => 0,
PrintError => 1
}
);
printf "Using %s, db version %s\n", $ENV{DBI_DSN}, $dbh->get_info(18) ||
"unknown";
my $sth;
my @ok;
print "=> Drop table '$tablename', if it already exists...\n";
$sth = $dbh->do("DROP TABLE $tablename");
print "=> Create table '$tablename'...\n";
$sth = $dbh->prepare("CREATE TABLE $tablename (key int, mycol char(8))");
$sth->execute();
print "=> Insert 4 rows into the table...\n";
my @stv = ('slow', undef, 'quick', undef);
$sth = $dbh->prepare("INSERT INTO $tablename (key, mycol) VALUES (?,?)");
for my $i (0..3)
{
$sth->execute($i+1, $stv[$i]);
}
# Define the SQL statements with the various WHERE clause styles we want to
test.
my @sel = (
qq{WHERE mycol = ?},
qq{WHERE NVL(mycol, '-') = NVL(?, '-')},
qq{WHERE ISNULL(mycol, '-') = ISNULL(?, '-')},
qq{WHERE DECODE(mycol, ?, 1, 0) = 1},
qq{WHERE mycol = ? OR (mycol IS NULL AND ? IS NULL)},
qq{WHERE mycol = ? OR (mycol IS NULL AND SP_ISNULL(?) = 1)},
qq{WHERE mycol = ? OR (mycol IS NULL AND ? = 1)},
);
# Define the execute method argument lists for non-null values.
# The order must map one to one with the above SQL statements.
my @nonnull_args = (
['quick'],
['quick'],
['quick'],
['quick'],
['quick','quick'],
['quick','quick'],
['quick', 0],
);
# Define the execute method argument lists for null values.
# The order must map one to one with the above SQL statements.
my @null_args = (
[undef],
[undef],
[undef],
[undef],
[undef, undef],
[undef, undef],
[undef, 1],
);
# Run the tests...
for my $i ([EMAIL PROTECTED])
{
print "\n=> Testing clause style $i: $sel[$i]\n";
$sth = $dbh->prepare("SELECT key,mycol FROM $tablename $sel[$i]")
or next;
$sth->execute(@{$nonnull_args[$i]})
or next;
my $r1 = $sth->fetchall_arrayref();
my $n1 = $sth->rows;
$sth->execute(@{$null_args[$i]})
or next;
my $r2 = $sth->fetchall_arrayref();
my $n2 = $sth->rows;
# Did we get back the expected "n"umber of rows?
# Did we get back the specific "r"ows we expected as identifed by the key
column?
if ( $n1 == 1
&& $n2 == 2
&& $r1->[0][0] == 3
&& $r2->[0][0] == 2
&& $r2->[1][0] == 4)
{
print "=> WHERE clause style $i is supported.\n";
push @ok, "$i: $sel[$i]";
}
else {
print "=> WHERE clause style $i returned incorrect results.\n";
}
}
$dbh->disconnect();
printf "\n%d styles are supported\n", scalar @ok;
print "$_\n" for @ok;
print "\n";
